[ 
https://issues.apache.org/jira/browse/PDFBOX-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler reassigned PDFBOX-2270:
------------------------------------------

    Assignee: Andreas Lehmkühler

> PDField.getFullyQualifiedName() returns name adding suffix '.null'
> ------------------------------------------------------------------
>
>                 Key: PDFBOX-2270
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2270
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 1.5.0, 1.7.1, 1.8.0, 1.8.6
>         Environment: JSE1.6
>            Reporter: Javier García Sánchez
>            Assignee: Andreas Lehmkühler
>              Labels: PDAcroForm, PDField, getFullyQualifiedName()
>         Attachments: TesterFields.java, business_loan_app_1_signer.pdf
>
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> We have several PDF files, each containing one PDF form with its own 
> fields. We need to read all of the form fields and list them in a text file.
>  
> The problem arises when a PDF form has duplicated field names: 
> field.getFullyQualifiedName() then returns the wrong name, with '.null' 
> appended to the end of the field name.
> -->Situation:
> 1) A PDF file contains a PDF form.
> 2) The form contains many fields, and some of the field names are 
> duplicated, for example 'Applicant.city'.
> 3) When I list all of the field names, the duplicated names come back 
> with the suffix '.null'. This happens only for duplicated field names.
> ----------------------------------------------------------------------------------------------
> -->Example:
> 1) A PDF form with 4 fields whose names are: 'Applicant.name', 
> 'Applicant.phone', 'Applicant.ssn', 'Applicant.name'.
> 2) After running the code shown below, the resulting list is: 
> 'Applicant.name.null', 'Applicant.phone', 'Applicant.ssn', 
> 'Applicant.name.null'.
> ----------------------------------------------------------------------------------------------
> -->Code used to list all PDF form field names:
>  import java.io.IOException;
>  import java.util.HashSet;
>  import java.util.List;
>  import java.util.Set;
>  import org.apache.pdfbox.pdmodel.PDDocument;
>  import org.apache.pdfbox.pdmodel.PDDocumentCatalog;
>  import org.apache.pdfbox.pdmodel.interactive.form.PDAcroForm;
>  import org.apache.pdfbox.pdmodel.interactive.form.PDField;
>
>  // Collects the fully qualified names of all terminal fields in the form.
>  public static Set<String> printFields( PDDocument doc ) throws IOException {
>         PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
>         PDAcroForm acroForm = docCatalog.getAcroForm();
>         Set<String> fieldSet = new HashSet<String>();
>         if( acroForm == null ){
>             return fieldSet; // the document has no form
>         }
>         for( Object obj : acroForm.getFields() ){
>             fieldSet.addAll( processField( (PDField)obj ) );
>         }
>         return fieldSet;
>     }
>
>     // Recurses into kids; only a field without kids contributes its name.
>     private static Set<String> processField( PDField field ) throws IOException {
>         List<?> kids = field.getKids();
>         Set<String> result = new HashSet<String>();
>         if( kids != null ){
>             for( Object pdfObj : kids ){
>                 // Kids that are not PDField instances are skipped.
>                 if( pdfObj instanceof PDField ){
>                     result.addAll( processField( (PDField)pdfObj ) );
>                 }
>             }
>         }else{
>             result.add( field.getFullyQualifiedName() );
>         }
>         return result;
>     }
> --------------------------------------------------------------------------------
> field.getFullyQualifiedName() is returning duplicated field names with the 
> suffix '.null'.
> Thanks in advance.
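
One plausible explanation for the '.null' suffix, offered as an assumption and not confirmed anywhere in this report: a duplicated field may be stored as a single named parent with several nameless kids, and a name built by joining the parent name, a dot, and a null partial name ends in the literal text "null", because Java string concatenation renders a null reference that way. The sketch below reproduces that effect with a hypothetical Node class (it is a stand-in, not the PDFBox PDField class) and shows a null-safe join for comparison.

```java
public class NullSuffixSketch {

    /** Hypothetical stand-in for a form field node; NOT the PDFBox PDField class. */
    public static final class Node {
        final Node parent;
        final String partialName; // null when the node carries no name of its own

        public Node(Node parent, String partialName) {
            this.parent = parent;
            this.partialName = partialName;
        }

        /** Naive join: concatenating a null String yields the text "null". */
        public String naiveFullName() {
            String name = partialName;
            if (parent != null) {
                name = parent.naiveFullName() + "." + name;
            }
            return name;
        }

        /** Null-safe join: a nameless node reports its parent's name. */
        public String safeFullName() {
            if (parent == null) {
                return partialName;
            }
            String parentName = parent.safeFullName();
            if (partialName == null) {
                return parentName;
            }
            return parentName == null ? partialName : parentName + "." + partialName;
        }
    }

    public static void main(String[] args) {
        // Assumed layout: 'Applicant.name' appears twice, as two nameless kids.
        Node applicant = new Node(null, "Applicant");
        Node name = new Node(applicant, "name");
        Node firstKid = new Node(name, null);
        Node phone = new Node(applicant, "phone");

        System.out.println(firstKid.naiveFullName()); // prints Applicant.name.null
        System.out.println(firstKid.safeFullName());  // prints Applicant.name
        System.out.println(phone.naiveFullName());    // prints Applicant.phone
    }
}
```

If this assumption holds for the attached PDF, only the nameless kids would pick up the suffix, which would match the observation that fields with unique names are listed correctly.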



--
This message was sent by Atlassian JIRA
(v6.2#6252)
