On Tuesday, May 26, 2020 at 7:23:42 AM UTC-4, David Bailey wrote:
>
> On 25/05/2020 23:42, Ben wrote:
>
>> You're totally correct -- LaTeX is ambiguous. I don't find your
>> observation discouraging since it is perfectly reasonable.
>>
>> The issue I'm interested in tackling is the conversion of math presented
>> in Physics papers (e.g., .tex files on arxiv.org) to a semantically
>> meaningful and unambiguous representation (e.g., SymPy).
>>
>> This issue would be moot if Physics papers were written in SymPy.  I don't
>> have insight on how to construct incentives that would lead to use of SymPy
>> in Physics papers, so I'm working on the LaTeX-to-SymPy approach.
>
> Right - well, in that case, maybe a system of hints that the user could
> give to your parser would be really useful. For example, if a user could tell
> your parser that superscripts were usually tensor indices rather than
> exponents (or alternatively that certain symbols used as superscripts would
> never mean exponents), you could come out with a better translation. Another
> useful hint might be a list of the multi-letter symbols in use - sin, cos,
> exp, ln, etc. - so that you could resolve the ambiguity of what ab means. I
> mean, sometimes sin(x) might mean s*i*n(x), and that could be handled by the
> user specifying that only certain multi-letter symbols were in use.
>
> David
>
>
>
Yeah, having talked this over with a collaborator, we think there are
several sources of context that can help with parsing:

   - within the math LaTeX string to parse, what can be deduced about the 
   expected context?
   - given other math expressions in the same paper, what would be 
   consistent?
   - given the text in a paper surrounding the math expressions, what would 
   be expected based on keywords?
   - given other papers in the same domain or based on citations, what 
   would be likely?
   - what is statistically likely given the corpus of all articles?

This is, in some sense, the same process a human goes through to decode the 
intended meaning of any given math expression in a scientific paper. We are 
looking to encode that process as a Python program. (That's beyond the 
scope of SymPy but is context for the issue.)
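
To make that a bit more concrete, here is a rough sketch of how the hint
idea and the list of sources above might fit together. It is not SymPy API
and not our actual code: every name below (split_symbols, rank_readings,
Evidence, SOURCE_WEIGHTS) is made up for illustration, the weights are
arbitrary placeholders, and the vote counts in the example are invented.
The only assumption that matters is that more specific sources (a user
hint, the same paper) should outweigh more general ones (the whole corpus).

```python
from collections import Counter
from dataclasses import dataclass


def split_symbols(letters, known_names):
    """Split a run of letters into symbols, longest known name first.

    known_names is the user-supplied hint: the multi-letter symbols in use.
    Anything not covered by a known name falls back to single letters.
    """
    names = sorted(known_names, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(letters):
        match = next((n for n in names if letters.startswith(n, i)), None)
        token = match if match else letters[i]
        tokens.append(token)
        i += len(token)
    return tokens


@dataclass
class Evidence:
    """Votes for each candidate reading, as reported by one source."""
    source: str
    votes: Counter


# Hypothetical weights: most specific source first, most general last.
SOURCE_WEIGHTS = {
    "user_hint": 16.0,   # explicit hint given to the parser
    "expression": 8.0,   # deduced from the LaTeX string itself
    "paper_math": 4.0,   # other expressions in the same paper
    "paper_text": 2.0,   # keywords in the surrounding text
    "domain": 1.0,       # papers in the same domain or citation graph
    "corpus": 0.5,       # statistics over all articles
}


def rank_readings(evidence):
    """Return (reading, score) pairs, best first.

    Each source contributes its weight, split across readings in
    proportion to that source's votes.
    """
    scores = Counter()
    for ev in evidence:
        total = sum(ev.votes.values())
        if total == 0:
            continue  # this source has nothing to say
        weight = SOURCE_WEIGHTS[ev.source]
        for reading, count in ev.votes.items():
            scores[reading] += weight * count / total
    return scores.most_common()


# Example: what does the superscript in g^{\mu\nu} mean? (invented counts)
evidence = [
    Evidence("corpus", Counter(exponent=90, tensor_index=10)),
    Evidence("paper_text", Counter(tensor_index=3)),
    Evidence("paper_math", Counter(tensor_index=5, exponent=1)),
]
best_reading = rank_readings(evidence)[0][0]  # 'tensor_index'

print(split_symbols("sinx", {"sin", "cos"}))  # ['sin', 'x']
print(split_symbols("sinx", set()))           # ['s', 'i', 'n', 'x']
print(best_reading)
```

With only the corpus-level evidence, the same superscript would come out as
an exponent; it is the paper-level context that flips the reading. A
weighted sum is just the simplest thing that shows the idea, and a real
system would presumably need something better calibrated.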
 

