Dear mpsuzuki,

Thanks for the clarification. I use pdfimages mainly for processing scanned
books. I don't expect to process books with more than 3,333 pages (which
would produce more than 10,000 files in the worst case), so %04d is enough
for my current use. However, do I have to modify the code and recompile
pdfimages myself?

In the long run, though, I think it would be better to have an option that
lets the user specify the numbering format of the output sequence.
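
To make the idea concrete, here is a rough sketch of both approaches (a
user-supplied width, and your suggestion of picking the width from the total
image count). It is standalone code, not a patch against ImageOutputDev.cc:
digitsFor(), makeImageName() and the digits parameter are hypothetical names
of mine, and I am assuming image numbering starts at 000.

```cpp
#include <algorithm>
#include <cstdio>
#include <string>

// Hypothetical helper: smallest zero-padded width that fits the indices
// 0 .. totalImages-1, never narrower than the current 3 digits.
// Assumes numbering starts at 0.
int digitsFor(int totalImages) {
    int digits = 1;
    for (int maxIndex = std::max(totalImages - 1, 0); maxIndex >= 10; maxIndex /= 10) {
        ++digits;
    }
    return std::max(digits, 3);
}

// Hypothetical stand-in for the non-pageNames branch of setFilename, with
// the field width passed in through "%0*d" instead of a hardwired "%03d".
std::string makeImageName(const std::string &fileRoot, int imgNum,
                          const std::string &fileExt, int digits) {
    const char *fmt = "%s-%0*d.%s";
    // First pass measures the result, second pass writes it.
    int len = std::snprintf(nullptr, 0, fmt, fileRoot.c_str(), digits, imgNum,
                            fileExt.c_str());
    if (len < 0) {
        return std::string();
    }
    std::string name(static_cast<size_t>(len), '\0');
    std::snprintf(&name[0], static_cast<size_t>(len) + 1, fmt, fileRoot.c_str(),
                  digits, imgNum, fileExt.c_str());
    return name;
}
```

For example, makeImageName("img", 7, "pbm", digitsFor(1001)) would give
img-0007.pbm, while 1,000 or fewer images would keep the current 3-digit
names. Numbers wider than the requested width are printed in full rather
than truncated.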

Thanks,
Abi

Jan 11, 2019, 12:52 PM by [email protected]:

> Dear Abigaile,
>
>> does that mean there is a way to specify numbering format already?
>>
>
> No. What I meant was...
>
> * If there is an existing parser for a user-defined numbering format
>   (outside pdfimages, but within poppler), it would be possible for
>   somebody to write a patch.
> * But if there is none, a discussion about the syntax would be needed
>   first.
>
> Or, "if the total number of images exceeds 1000, the numbering should be
> %04d; we do not need an interface to specify the numbering format" would be
> another solution. What do you think?
>
> Regards,
> mpsuzuki
>
> Abigaile Johannesburg wrote:
>
>> Dear mpsuzuki,
>>
>> Thank you for quoting the source file regarding the numbering scheme. When
>> you say
>>
>> "good syntax to specify numbering format, if possible, which is already used
>> by poppler's user interfaces."
>>
>> does that mean there is a way to specify numbering format already?
>>
>> Thanks,
>> Abi
>>
>> Jan 10, 2019, 12:49 AM by [email protected]:
>>
>> Dear Abigaile,
>>
>> At present, 3-digit numbering is hardwired, like this:
>>
>> https://gitlab.freedesktop.org/poppler/poppler/blob/master/utils/ImageOutputDev.cc#L83
>>
>> void ImageOutputDev::setFilename(const char *fileExt) {
>>   if (pageNames) {
>>     sprintf(fileName, "%s-%03d-%03d.%s", fileRoot, pageNum, imgNum, fileExt);
>>   } else {
>>     sprintf(fileName, "%s-%03d.%s", fileRoot, imgNum, fileExt);
>>   }
>> }
>>
>> I would like to know whether there is a good syntax for specifying the
>> numbering format, ideally one that is already used by poppler's user
>> interfaces.
>>
>> Regards,
>> mpsuzuki
>>
>> Abigaile Johannesburg wrote:
>> Hello,
>>
>> The default output numbering of pdfimages is 3 digits, e.g.
>> image-root-nnn.xxx. But if there are more than 1,000 output images, there
>> will be both image-root-nnn.xxx files (3-digit number sequence) and
>> image-root-nnnn.xxx files (4-digit number sequence). When processing book
>> images in bash, the ordering then needs a fix. At the moment I use rename:
>>
>> rename 's/img-([0-9]{3})\.pbm/img-0$1.pbm/' *.pbm
>>
>> Therefore I was wondering if there is a way to specify the format of output 
>> numbering directly in pdfimages.
>>
>> Thanks,
>> Abi
>>

_______________________________________________
poppler mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/poppler
