Sorry, I don't want to be a pain in the ass, but do none of you guys have
an idea how I can handle this problem? Either with podofo or with some
other tool.

2012/7/10 Jean-Philippe Green <[email protected]>
> Forgot to include file
>
>
> 2012/7/10 Jean-Philippe Green <[email protected]>
>
>> The problem with the "PDF to Text" tools I've tried is that I don't know
>> where the text is placed. I know which week it belongs to, because the
>> output first gives the interval of dates and then the shift times, but I
>> don't know which day it's under (see included file). So I figured I must
>> know the placement of the characters to work out which weekday a shift is
>> under, and use that as an offset from the first date in that week. For
>> example, for 12-07-04 we would first need to know the base date, i.e.
>> 12-07-02, and then add 2 because it's placed under Wednesday and not
>> Monday. You can see how I handle that in work_day.cpp.
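
A minimal sketch of the offset idea described above, assuming the horizontal position of each shift text is already known. The names `WeekGrid`, `weekdayOffset` and `addDays`, and the idea of equally wide weekday columns, are hypothetical and not taken from work_day.cpp; the real column positions would have to be measured from the schedule itself.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical page layout: x coordinate of the left edge of the Monday
// column and the width of one weekday column, both in PDF units.
struct WeekGrid {
    double mondayLeft;
    double columnWidth;
};

// Map the x position of a piece of text to a weekday offset
// (0 = Monday ... 6 = Sunday). Returns -1 if x is outside the grid.
int weekdayOffset(const WeekGrid& grid, double x) {
    assert(grid.columnWidth > 0.0);
    const double column = (x - grid.mondayLeft) / grid.columnWidth;
    if (column < 0.0 || column >= 7.0) return -1;
    return static_cast<int>(column);
}

// Add a number of days to a date written as "YY-MM-DD" and return the
// result in the same format. Assumes years 2000-2099. Returns an empty
// string if the input cannot be parsed.
std::string addDays(const std::string& yymmdd, int days) {
    int yy = 0, mm = 0, dd = 0;
    if (std::sscanf(yymmdd.c_str(), "%d-%d-%d", &yy, &mm, &dd) != 3) return "";
    std::tm t = {};
    t.tm_year = 100 + yy;   // tm_year counts from 1900
    t.tm_mon = mm - 1;      // tm_mon is zero-based
    t.tm_mday = dd + days;  // mktime normalises day overflow into the next month
    t.tm_hour = 12;         // noon keeps clear of daylight-saving jumps
    t.tm_isdst = -1;
    if (std::mktime(&t) == -1) return "";
    char buf[16];
    std::strftime(buf, sizeof buf, "%y-%m-%d", &t);
    return buf;
}
```

With a grid starting at x = 50 and columns 100 units wide (made-up numbers), a text at x = 260 gives offset 2, and `addDays("12-07-02", 2)` gives the Wednesday date from the example.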
>>
>> Could you please elaborate a little more on how PDFs work? Is it that
>> every character is at a different place in the document? Although it's
>> annoying, I guess I'll have to deal with it in order to know where the
>> text is placed. A PDF does tell where every character/string is placed,
>> right?
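
On the placement question: in a decoded page content stream, text is drawn with operators such as `Tj`, and where it starts is set by text-positioning operators such as `Tm` (set the text matrix) and `Td` (move relative to the start of the current line). The sketch below is a toy reader for a small subset of that syntax, not PoDoFo's API; `PlacedText` and `extractPlacedText` are hypothetical names. It ignores the `cm` transformation, `TJ` arrays, `TD`, `T*`, nested parentheses, proper escape decoding and font encodings, and it does not track glyph widths, so two `Tj` calls on the same line without repositioning report the same x.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// One string together with the position where its line starts.
struct PlacedText {
    double x, y;
    std::string text;
};

// Toy reader: understands only BT, Tm, Td and Tj with literal strings.
std::vector<PlacedText> extractPlacedText(const std::string& stream) {
    std::vector<PlacedText> out;
    std::vector<double> nums;            // numeric operands seen so far
    std::string lastString;              // most recent (literal) string
    double m[6] = {1, 0, 0, 1, 0, 0};    // text line matrix a b c d e f
    std::size_t i = 0;
    while (i < stream.size()) {
        const char c = stream[i];
        if (c == ' ' || c == '\n' || c == '\r' || c == '\t') { ++i; continue; }
        if (c == '(') {                  // literal string operand
            lastString.clear();
            for (++i; i < stream.size() && stream[i] != ')'; ++i) {
                // Simplification: a backslash just keeps the next character.
                if (stream[i] == '\\' && i + 1 < stream.size()) ++i;
                lastString += stream[i];
            }
            ++i;                         // skip the closing ')'
            continue;
        }
        const std::size_t start = i;
        while (i < stream.size() && stream[i] != ' ' && stream[i] != '\n' &&
               stream[i] != '\r' && stream[i] != '\t' && stream[i] != '(') ++i;
        const std::string tok = stream.substr(start, i - start);
        char* end = nullptr;
        const double v = std::strtod(tok.c_str(), &end);
        if (end != tok.c_str() && *end == '\0') { nums.push_back(v); continue; }
        if (tok == "BT") {               // begin text: reset to identity
            m[0] = 1; m[1] = 0; m[2] = 0; m[3] = 1; m[4] = 0; m[5] = 0;
        } else if (tok == "Tm" && nums.size() >= 6) {
            for (int k = 0; k < 6; ++k) m[k] = nums[nums.size() - 6 + k];
        } else if (tok == "Td" && nums.size() >= 2) {
            const double tx = nums[nums.size() - 2], ty = nums[nums.size() - 1];
            m[4] += tx * m[0] + ty * m[2];   // translate within the text matrix
            m[5] += tx * m[1] + ty * m[3];
        } else if (tok == "Tj") {
            out.push_back({m[4], m[5], lastString});
        }
        nums.clear();                    // operands belong to one operator only
    }
    return out;
}
```

Fed a made-up stream like `BT 1 0 0 1 100 700 Tm (07:00-15:30) Tj 120 0 Td (Ledig) Tj ET`, this reports the first string at (100, 700) and the second at (220, 700). A real schedule would need a proper tokenizer, since strings can be hex-encoded or split into individually positioned pieces.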
>>
>>
>> 2012/7/10 Leonard Rosenthol <[email protected]>
>>
>>> The issue you are going to run into is that a PDF file isn't organized
>>> in the same way that you are trying to work with it. You won't just get "a
>>> bunch of text"…
>>>
>>> You would probably be better served finding a nice "PDF to Text"
>>> application or library and then calling that from your own program.
>>>
>>> Leonard
>>>
>>> From: Jean-Philippe Green <[email protected]>
>>> To: "[email protected]" <
>>> [email protected]>
>>> Subject: [Podofo-users] Parsing a pdf-schedule to create a .ical or .csv
>>>
>>> Hi there!
>>>
>>> Where I work we get our schedules as visual PDF files, which really
>>> bothers me because I want to sync them with my calendar (Google Calendar
>>> in my case). So I decided to write a program that will parse the PDF file
>>> and produce an iCalendar or CSV file. Although I don't know much C++, I
>>> decided to do it in that language to get to know it better.
>>>
>>> So far I have written a "work_day" class which holds all information
>>> for one work day, such as date, start time, end time and name of the
>>> shift. Then I have a main file which reads the PDF and writes it to a
>>> CSV (the format will be optional later). But when it comes to reading the
>>> PDF it's getting really hard. I managed to get the first stream object and
>>> decode it, but then I have no idea what to do with it.
>>>
>>> I'm not sure I completely understand how I'm supposed to use PoDoFo,
>>> so I would be grateful for some directions on how to tackle this
>>> problem. I would also be happy if you told me how I'm supposed to
>>> code in C++, because I'm sure my coding style doesn't follow the
>>> "standard" way of doing things. I'm including my schedule and the cpp
>>> files. You shouldn't need to read work_day.cpp and csv_schedule.cpp, but I
>>> include them in case you're curious.
>>>
>>> Anyway, thanks for making this library. I understand that a lot of work
>>> has been put into it.
>>>
>>> PS.
>>> If you wonder why I have a French name and the pdf includes a lot of
>>> Swedish, it's because I have a French mother and I live in Sweden.
>>>
>>
>>
>
_______________________________________________
Podofo-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/podofo-users