It sounds to me like converting 2000 files in an hour is pretty good: that is
1.8 seconds per file.
My suggestion is to put the files on more than one computer and run the
conversions simultaneously. If you have a million files, you know it is going to
take a long time to create PDFs out of them.
You'll save much more time by splitting the load across multiple
computers than you will by fiddling with anything below.
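One way the file list could be divided between machines is sketched below. This is only a rough illustration, not something taken from the original script: the function name split_list and the "chunk." file prefix are made up for the example, and it assumes the list holds one file name per line, as TxtFileList.out does.

```shell
#!/bin/sh
# Sketch: split one list of file names into N roughly equal chunk lists,
# one per machine. split_list and the "chunk." prefix are hypothetical names.
split_list() {
    list=$1      # file with one TXT file name per line
    parts=$2     # number of machines (or parallel workers)
    outdir=$3    # directory where the chunk lists are written

    total=$(wc -l < "$list")
    # Nothing to split if the list is empty.
    [ "$total" -gt 0 ] || return 0

    # Round up so that at most $parts chunk files are produced.
    per=$(( (total + parts - 1) / parts ))

    # Writes $outdir/chunk.aa, $outdir/chunk.ab, ... with $per lines each
    # (the last chunk may be shorter).
    split -l "$per" "$list" "$outdir/chunk."
}

# Example call (assumes the variables from the script below are set):
#   split_list "$ConversionScriptDir/TxtFileList.out" 4 "$ConversionScriptDir"
```

Each machine would then run the existing conversion loop with its own chunk file as input instead of the full TxtFileList.out. With 4 machines, the estimated 500 hours would drop to roughly 125 hours, assuming each machine converts at about the same rate.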
Thanks,
Daniel Gibby
On 7/29/2014 9:15 AM, Basharat Ali wrote:
Hi,
I am using the PDFBox utility to convert TXT files to PDF. I have written
the following script:
echo " Remove Old TXT File List " >> $LogFileDir/ConvertTxtToPdf.log
rm $ConversionScriptDir/TxtFileList.out
echo " Remove Old PDF File List " >> $LogFileDir/ConvertTxtToPdf.log
rm $ConversionScriptDir/PDFFileslist.out
echo " Make List of TXT Files we are going to convert to PDF " >> $LogFileDir/ConvertTxtToPdf.log
ls -a $TxtFilesDir | grep '\.TXT$' > $ConversionScriptDir/TxtFileList.out
echo " TXT File Listing is Complete " >> $LogFileDir/ConvertTxtToPdf.log
echo " Reading TXT File Listing " >> $LogFileDir/ConvertTxtToPdf.log
touch $ConversionScriptDir/PDFFileslist.out
while read line;
do
PDFOutFile=`echo $line|cut -d '.' -f 1`
java -jar $PdfConvertorDir/pdfbox-app-1.8.6.jar TextToPDF $PdfFilesDir/$PDFOutFile.PDF $TxtFilesDir/$line
echo " TXT File Converted to PDF = $line " >> $ConversionScriptDir/PDFFileslist.out
done < $ConversionScriptDir/TxtFileList.out
echo " All TXT to PDF Conversion is completed successfully. Please verify the PDF Files at:: $PdfFilesDir "
This takes about 1 hour to convert 2000 files. I have about 1 million such
files, so it will take about 500 hours. Is there a quicker way to
convert the TXT files to PDF?
Thanks
Bash