[ 
https://issues.apache.org/jira/browse/LUCENE-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563811#comment-13563811
 ] 

Michael McCandless commented on LUCENE-4724:
--------------------------------------------

OK I agree: let's disallow empty string at indexing time.

I think this means the CategoryPath ctor that takes String... varargs should 
throw an exception if any component is the empty string?
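
A minimal sketch of what such a check could look like. StrictCategoryPath is a 
hypothetical stand-in written for illustration, not Lucene's actual CategoryPath, 
and the choice of IllegalArgumentException (and of also rejecting null) is an 
assumption:

```java
import java.util.Arrays;

// Hypothetical stand-in for CategoryPath; NOT the real Lucene class.
final class StrictCategoryPath {
  private final String[] components;

  // Rejects any null or empty component up front, at construction time,
  // so such a path can never reach the index.
  StrictCategoryPath(String... components) {
    for (int i = 0; i < components.length; i++) {
      if (components[i] == null || components[i].isEmpty()) {
        throw new IllegalArgumentException(
            "empty or null component at position " + i
                + " in " + Arrays.toString(components));
      }
    }
    // Defensive copy so later changes to the caller's array do not leak in.
    this.components = components.clone();
  }

  int length() {
    return components.length;
  }

  String component(int i) {
    return components[i];
  }
}
```

With this in place, new StrictCategoryPath("categories", "") fails immediately 
at indexing time instead of producing a path that later trips up a reader-side 
tool.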

Not sure what (if anything) to do about indices "out there" that already have 
an empty string component ... I'm not sure these ever cause problems except for 
PrintTaxonomyStats ... so I could just add some robustness to that one tool.

However, I don't really like being "tolerant" of a trailing delimiter, multiple 
delimiters in a row, etc. (like filesystems are): I would prefer that we be 
strict and accept only one form.  Accepting several equivalent forms can only 
cause problems/confusion.
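
To illustrate the strict behavior, here is a rough sketch of a parser that 
accepts exactly one form of a delimited path string. StrictPathParser and its 
parse method are hypothetical names, not part of Lucene. The accepted form 
(components separated by exactly one delimiter, no leading or trailing 
delimiter) and the treatment of the empty input as the root path with zero 
components are both assumptions:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not a Lucene API.
final class StrictPathParser {

  // Splits path on delimiter and rejects every empty component, which rules
  // out a leading delimiter, a trailing delimiter, and doubled delimiters.
  static String[] parse(String path, char delimiter) {
    if (path.isEmpty()) {
      return new String[0]; // assumption: empty input means the root path
    }
    List<String> parts = new ArrayList<>();
    int start = 0;
    // Iterate one position past the end so the last component is flushed too.
    for (int i = 0; i <= path.length(); i++) {
      if (i == path.length() || path.charAt(i) == delimiter) {
        if (i == start) {
          throw new IllegalArgumentException(
              "empty component at offset " + start + " in \"" + path + "\"");
        }
        parts.add(path.substring(start, i));
        start = i + 1;
      }
    }
    return parts.toArray(new String[0]);
  }
}
```

Under these assumptions "a/b" parses to two components, while "a/b/", "a//b" 
and "/a" are all rejected rather than silently normalized.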
                
> TaxonomyReader drops empty string component from CategoryPath
> -------------------------------------------------------------
>
>                 Key: LUCENE-4724
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4724
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/facet
>            Reporter: Michael McCandless
>             Fix For: 4.2, 5.0
>
>         Attachments: LUCENE-4724.patch, LUCENE-4724.patch
>
>
> I ran the new PrintTaxonomyStats on a Wikipedia facets index, and it hit an 
> AIOOBE because there was a child of the /categories path that had only one 
> component ... this was created because I had added new 
> CategoryPath("categories", "") during indexing.
> I think TaxoReader should preserve and return that empty string from .getPath?
