Dear Albert, Adam, Thank you for the helpful review. I should have more training on C++... Please let me spend to improve the patch.
> (Another easy option albeit with some > overhead to get automatic destruction would be to wrap the actual data > inside a std::shared_ptr and hence use reference counting.) Yeah, from some posts in stackoverflow, I found about shared_ptr... But it seems that current poppler does not use shared_ptr at all. Is this a coding policy? Regards, mpsuzuki Adam Reichold wrote: > Hello > > Am 02.01.2018 um 01:06 schrieb Albert Astals Cid: >> El diumenge, 31 de desembre de 2017, a les 0:45:16 CET, suzuki toshiya va >> escriure: >>> Hi, >>> >>> I've tried to implement the suggestion, I attached my current patch. >>> >>> As suggested, the most part is just copied from Qt frontend and renamed, >>> except of one point: TextBox.nextWord() looks slightly confusing, >>> because the returned object is a pointer to TextBox. I wrote >>> text_box.next_text_box() and a macro text_box.next_word() which >>> calls next_text_box() internally. >>> >>> Another point I want to discuss is the design of the list give by >>> poppler::page::text_list(). In Qt frontend, Page::textList() returns >>> QList<TextBox*>. For similarity, current patch returns >>> std::vector<text_box*> for similarity to Qt frontend. >>> >>> But, if we return the vector of pointers, the client should destruct >>> the objects pointed by the vector, before destructing vector itself. >>> Using a vector of text_box (not the pointer but the object itself), >>> like std::vector<text_box>, could be better, because the destructor >>> of the vector would internally call the destructor for text_box object. >>> (Qt has qDeleteAll(), but I think std::vector does not have such). >>> If I'm misunderstanding about C++, please correct. >> First you need to fix the text_box class, you either need to make it >> uncopiable (and thus making your suggestion not possible) or you need to >> make >> the copy and assignment operators not break your class since at the moment >> the >> default operators will just copy the pointer and everything will break >> because >> you will end up with a dangling pointer. >> >> If you go for the second option you'll want the move constructor and move >> operator too. > > Should the first option also be viable if text_box becomes a move-only > type, i.e. no copying but move construction and assignment? I think > std::vector can handle it. (Another easy option albeit with some > overhead to get automatic destruction would be to wrap the actual data > inside a std::shared_ptr and hence use reference counting.) > > Best regards, Adam. > >> Cheers, >> Albert >> >>> Regards, >>> mpsuzuki >>> >>> Albert Astals Cid wrote: >>>> El dimecres, 27 de desembre de 2017, a les 12:26:25 CET, Jeroen Ooms va >>>> >>>> escriure: >>>>> Is there a method in poppler-cpp to extract text from a pdf document, >>>>> including the position of each text box? Currently we use page->text() >>>>> with page::physical_layout which gives all text per page, but I need >>>>> more detailed information about each text box per page. >>>> You want to code the variant of qt5 frontend Poppler::Page::textList() for >>>> cpp frontend, it shouldn't be that hard getting inspiration (i.e. >>>> almost-copying) the code, do you have time for it? >>>> >>>> Cheers, >>>> >>>> Albert >>>>> _______________________________________________ >>>>> poppler mailing list >>>>> poppler@lists.freedesktop.org >>>>> https://lists.freedesktop.org/mailman/listinfo/poppler >>>> _______________________________________________ >>>> poppler mailing list >>>> poppler@lists.freedesktop.org >>>> https://lists.freedesktop.org/mailman/listinfo/poppler >> >> _______________________________________________ >> poppler mailing list >> poppler@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/poppler >> > _______________________________________________ poppler mailing list poppler@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/poppler