On Mon, Apr 16, 2018 at 7:08 AM, <[email protected]> wrote:

> Guys hi,
>
> From docs (Importing External Data in Beancount
> <https://docs.google.com/document/d/11EwQdujzEo2cxqaF5PgxCEZXWfKKQCYSMfdJowp_1S8/edit#heading=h.gmhz8vh65l8g>)
> i've understood that PDF is not easy format to work with.
> But how other formats compare?
PDF contains blocks of text with their coordinates, a bunch of PostScript formatting directives (it's a fun language, you should learn it), and other binary objects. Split lines might generate multiple blocks of text. It's a drawing. There isn't much structure to read; if you embark on that project, you have to use a bunch of heuristics to figure out the original structure from the blocks of text and their locations. It's not quite a research problem, but it's not a job you'll ace in one or two weekends of hacking (though it might be very rewarding; I have an incomplete stab at extracting tables from PDF that looks promising, so I have a feeling it's definitely doable).

> In chase bank i can choose - CSV, QFX, QIF, IIF or QBO
> <https://puu.sh/A4i7B/207ec565ec.png>

CSV or QIF are your easiest ones here. QFX and QBO are variants of an XML format called OFX, with a large set of tags which are often produced inconsistently by the same Java programmers I alluded to in the other thread. (Or they might have been .NET programmers; if I'm not mistaken, OFX was born in the guts of the evil empire itself, but probably before the .NET days.)

> Which one is better?

If all have the same numbers, the simplest one. If some have more information, judge whether the extra data is something you want and whether it is worth the extra hacking effort. I'd shoot for CSV myself (I like it simple, and I prefer spending weekend time in the kitchen rather than in front of the computer).
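For a sense of how little hacking the CSV route takes, here is a minimal sketch using only the Python standard library. The column names (`Date`, `Description`, `Amount`), the date format, and the account name are assumptions made for illustration; a real bank export will likely use different headers and will need adjusting. This is not Beancount's own importer framework, just a plain script. Each entry is emitted with the `!` flag and a single posting, so the balancing leg still has to be filled in by hand.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Hypothetical sample data; real bank exports will have different columns.
SAMPLE = """Date,Description,Amount
04/13/2018,COFFEE SHOP,-4.50
04/14/2018,PAYROLL,1500.00
"""


def read_transactions(text):
    """Parse CSV text into (date, description, amount) tuples."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumed MM/DD/YYYY dates; change the format string if yours differ.
        date = datetime.strptime(row["Date"], "%m/%d/%Y").date()
        # Decimal avoids the rounding surprises of binary floats.
        amount = Decimal(row["Amount"])
        rows.append((date, row["Description"].strip(), amount))
    return rows


def to_beancount(rows, account="Assets:Bank:Checking", currency="USD"):
    """Render rows as incomplete Beancount entries flagged with '!'."""
    entries = []
    for date, description, amount in rows:
        entries.append(
            f'{date.isoformat()} ! "{description}"\n'
            f"  {account}  {amount} {currency}\n"
        )
    return "\n".join(entries)


if __name__ == "__main__":
    print(to_beancount(read_transactions(SAMPLE)))
```

To use it on a downloaded file, replace `SAMPLE` with the contents of the export, for example `open("export.csv", newline="").read()`, where the file name is likewise a placeholder.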
