It's not hard to query PDFs with SPARQL. All you have to do is extract the metadata from the document and turn it into RDF, if needed. Lots of programs extract and display this metadata already.
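
As a rough illustration of the extraction step, here is a minimal sketch using only the Python standard library. It assumes the PDF carries an uncompressed XMP packet wrapped in an x:xmpmeta element (common, but not guaranteed), and it only handles simple literal properties. The function names, the sample bytes, and the file:///paper.pdf subject are made up for the example.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical stand-in for the bytes of a real PDF file.
SAMPLE_PDF = b"""%PDF-1.4
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
      pdf:Producer="pdfTeX">
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Example Paper</rdf:li></rdf:Alt></dc:title>
   <dc:creator><rdf:Seq><rdf:li>A. Author</rdf:li><rdf:li>B. Author</rdf:li></rdf:Seq></dc:creator>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
%%EOF"""


def extract_xmp(pdf_bytes):
    """Return the first x:xmpmeta element in the raw bytes as text, or None."""
    # Assumption: the packet is stored uncompressed and uses the "x" prefix.
    match = re.search(rb"<x:xmpmeta\b.*?</x:xmpmeta>", pdf_bytes, re.DOTALL)
    return match.group(0).decode("utf-8") if match else None


def xmp_to_triples(xmp_xml, subject):
    """Flatten simple XMP properties into (subject, predicate, literal) tuples."""
    root = ET.fromstring(xmp_xml)
    triples = []
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # Properties written as attributes, skipping rdf:about and friends.
        for name, value in desc.attrib.items():
            if name.startswith("{") and not name.startswith(f"{{{RDF_NS}}}"):
                triples.append((subject, name[1:].replace("}", ""), value))
        # Properties written as child elements, possibly rdf:Alt/Seq/Bag lists.
        for prop in desc:
            predicate = prop.tag[1:].replace("}", "")
            items = list(prop.iter(f"{{{RDF_NS}}}li"))
            values = [li.text for li in items] if items else [prop.text]
            for value in values:
                if value and value.strip():
                    triples.append((subject, predicate, value.strip()))
    return triples


def to_ntriples(triples):
    """Serialize the tuples as N-Triples with plain string literals."""
    lines = []
    for s, p, o in triples:
        escaped = o.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        lines.append(f'<{s}> <{p}> "{escaped}" .')
    return "\n".join(lines)
```

The N-Triples text can then be loaded into any SPARQL engine. With rdflib, for instance, something along the lines of Graph().parse(data=nt_text, format="nt") followed by graph.query(...) should work, though that part is not shown or tested here.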

No, I don't think that viewing this issue from the reviewer perspective is too narrow. Reviewers form a vital part of the scientific publishing process. Anything that makes their jobs harder or the results that they produce worse would need very large benefits over the current setup. In any case, I haven't been looking at the reviewer perspective only, even in the message quoted below.

peter

PS: This is *not* to say that I think that the reviewing process is anywhere near ideal. On the contrary, I think that the reviewing process has many problems, particularly as it is performed in CS conferences.


On 10/06/2014 09:19 AM, Martynas Jusevičius wrote:
Dear Peter,

please show me how to query PDFs with SPARQL. Then I'll believe there
are no benefits of XHTML+RDFa over PDF.

Addressing the issue from the reviewer perspective only is too narrow,
don't you think?


Martynas

On Mon, Oct 6, 2014 at 6:08 PM, Peter F. Patel-Schneider
<pfpschnei...@gmail.com> wrote:


On 10/06/2014 08:38 AM, Phillip Lord wrote:

"Peter F. Patel-Schneider" <pfpschnei...@gmail.com> writes:

I would be totally astonished if using htlatex as the main way to produce
conference papers were as simple as this.

I just tried htlatex on my ISWC paper, and the result was, to put it mildly,
horrible.  (One of my AAAI papers was about the same, the other one caused an
undefined control sequence and only produced one page of output.)  Several
parts of the paper were rendered in fixed-width fonts.  There was no attempt
to limit line length.  Footnotes were in separate files.



The footnote thing is pretty strange, I have to agree. Although
"footnotes" are a fairly alien concept with respect to the web. Probably
hover overs would be a reasonable presentation for this.


Many non-scalable images were included, even for simple math.


It does MathML I think, which is then rendered client side. Or you could
drop math-mode straight through and render client side with mathjax.


Well, somehow png files are being produced for some math, which is a
failure.  I don't know what the way to do this right would be, I just know
that the version of htlatex for Fedora 20 fails to reasonably handle the
math in this paper.

My carefully designed layout for examples was modified in ways that
made the examples harder to understand.


Perhaps this is a key difference between us. I don't care about the
layout, and want someone to do it for me; it's one of the reasons I use
latex as well.


There are many cases where line breaks and indentation are important for
understanding.  Getting this sort of presentation right in latex is a pain
for starters, but when it has been done, having the htlatex toolchain mess
it up is a failure.

That said, the result was better than I expected.  If someone upgrades htlatex
to work well I'm quite willing to use it, but I expect that a lot of work is
going to be needed.


Which gets us back to the chicken and egg situation. I would probably do
this; but, at the moment, ESWC and ISWC won't let me submit it. So, I'll
end up with the PDF output anyway.


Well, I'm with ESWC and ISWC here.  The review process should be designed to
make reviewing easy for reviewers.  Until viewing HTML output is as
trouble-free as viewing PDF output, PDF should be the required format.

This is why it is important that web conferences allow HTML, which is
where the argument started. If you want something that prints just
right, PDF is the thing for you. If you want to read your papers in
the bath, likewise, PDF is the thing for you. And that's fine by me (so
long as you don't mind me reading your papers in the bath!). But it
needs to not be the only option.


Why?  What are the benefits of HTML reviewing, right now?  What are the
benefits of HTML publishing, right now?  If there were HTML-based tools that
worked well for preparing, reviewing, and reading scientific papers, then
maybe conferences would use them.  However, conference organizers and
reviewers have limited time, and are thus going for the simplest solution
that works well.

If some group thinks that a good HTML-based solution is possible, then let
them produce this solution.  If the group can get pre-approval of some
conference, then more power to them.  However, I'm not going to vote for any
pre-approval of some future solution when the current situation is
satisficing.

Phil


peter


