#303: plotextractor: support LaTeX context extraction
-------------------------+--------------------------------------------------
 Reporter:  jlavik       |       Owner:  jlavik
     Type:  enhancement  |      Status:  new   
 Priority:  major        |   Milestone:  v1.0  
Component:  MiscUtil     |     Version:        
 Keywords:               |  
-------------------------+--------------------------------------------------
 The plotextractor currently extracts captions from plots and figures. It
 would also be useful to extract the surrounding text (the context)
 wherever an image is referenced in the fulltext LaTeX.

 The amount of text to extract can be limited to the sentence in which the
 reference was found, plus one sentence on either side. New paragraphs and
 complex LaTeX structures (\begin/\end blocks, figure tags, etc.) should be
 excluded, but simple tags such as \cite and \ref should be kept.
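
 A minimal Python sketch of the extraction rule described above, not the
 plotextractor's actual code. The function name {{{extract_contexts}}}, the
 {{{window}}} parameter and the regular expressions are all hypothetical.
 The sentence splitter is deliberately naive (it breaks on whitespace after
 '.', '!' or '?'), so abbreviations such as "Fig. " would be split wrongly,
 and same-named nested environments are not handled.

```python
import re

# Matches a whole \begin{env} ... \end{env} block (naive: no support for
# nesting of the same environment).
ENV_BLOCK = re.compile(r"\\begin\{([^}]+)\}.*?\\end\{\1\}", re.DOTALL)
# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def extract_contexts(latex, label, window=1):
    """Return one context string per sentence that contains \\ref{label}."""
    ref = "\\ref{" + label + "}"
    contexts = []
    # Replace complex structures with a paragraph break so that a context
    # never spans them.
    stripped = ENV_BLOCK.sub("\n\n", latex)
    for paragraph in re.split(r"\n\s*\n", stripped):
        flat = " ".join(paragraph.split())  # collapse line breaks
        sentences = [s for s in SENTENCE_END.split(flat) if s]
        for i, sentence in enumerate(sentences):
            if ref in sentence:
                start = max(0, i - window)
                # Simple tags such as \cite and \ref are left untouched.
                contexts.append(" ".join(sentences[start:i + window + 1]))
    return contexts
```

 Calling {{{extract_contexts(text, "fig:eff")}}} would return the sentence
 holding each {{{\ref{fig:eff}}}} together with its neighbours from the same
 paragraph. Splitting into paragraphs first is what keeps a context from
 crossing a paragraph boundary.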

 The context for each image can be saved in a separate file and uploaded
 via FFT as a subformat of the image (e.g. {{{fig1.png.context}}}) using
 {{{.png;context}}} in {{{$f}}} together with the 'HIDDEN' keyword in
 {{{$o}}}, to hide it from metadata.
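
 The upload step could look roughly like the Python sketch below. The
 helper names {{{write_context_file}}} and {{{build_fft_field}}} are
 hypothetical, and the use of {{{$a}}} for the file location is an
 assumption; only {{{$f}}} and {{{$o}}} are taken from this ticket. A real
 upload would probably need further subfields to attach the file to the
 right document.

```python
import xml.etree.ElementTree as ET


def write_context_file(image_path, context):
    """Write the context next to the image: fig1.png -> fig1.png.context."""
    context_path = image_path + ".context"
    with open(context_path, "w", encoding="utf-8") as handle:
        handle.write(context)
    return context_path


def build_fft_field(context_path, image_format=".png"):
    """Build a MARCXML FFT datafield for the context subformat."""
    field = ET.Element("datafield", {"tag": "FFT", "ind1": " ", "ind2": " "})
    subfields = (
        ("a", context_path),               # assumption: file location
        ("f", image_format + ";context"),  # from the ticket
        ("o", "HIDDEN"),                   # from the ticket
    )
    for code, value in subfields:
        subfield = ET.SubElement(field, "subfield", {"code": code})
        subfield.text = value
    return ET.tostring(field, encoding="unicode")
```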

 The context can then be used when searching for plots, indexed in the
 same way as fulltexts.

 (Note: Extracting this from PDFs is just as relevant, but will be
 introduced at a later date.)

-- 
Ticket URL: <http://invenio-software.org/ticket/303>
Invenio <http://invenio-software.org>
