[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258260#comment-14258260 ] Shalin Shekhar Mangar commented on LUCENE-3312: --- This never landed on branch_5x. Can someone help me understand why? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Trunk Attachments: LUCENE-3312-DocumentIterators-uwe.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258300#comment-14258300 ] Uwe Schindler commented on LUCENE-3312: --- The problem is currently some API problems with DocValues and StoredFields. The current API makes then somehwo the same and Robert and I are not happy with that. Because the features around DocValues should now be stabilized, we can look into this again. But to do this, I have to first review what currently in trunk... Maybe a task for post-XMas :-) Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Trunk Attachments: LUCENE-3312-DocumentIterators-uwe.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258304#comment-14258304 ] Robert Muir commented on LUCENE-3312: - that is the most confusing thing, but there are other confusing things. * why does StorableField have a readerValue() method? * why does StorableField have IndexableFieldType? Currently, i dont understand the benefits of the trunk api. it does not seem to allow any more flexibility than branch_5x, just a ton of abstractions? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Trunk Attachments: LUCENE-3312-DocumentIterators-uwe.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258353#comment-14258353 ] Uwe Schindler commented on LUCENE-3312: --- bq. Currently, i dont understand the benefits of the trunk api The main advantage is the fact that IndexReader.document() for stored fields does not return something that can be indexed directly using IndexWriter.addDocument(), preventing a common trap of people thinging that they can update a document by first fetching it from index, modifying it and indexing it back. Unfortunately the current implementation is a bit confusing, but I still want to go that route. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Trunk Attachments: LUCENE-3312-DocumentIterators-uwe.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14258432#comment-14258432 ] Robert Muir commented on LUCENE-3312: - {quote} The main advantage is the fact that IndexReader.document() for stored fields does not return something that can be indexed directly using IndexWriter.addDocument(), preventing a common trap of people thinking that they can update a document by first fetching it from index, modifying it and indexing it back. {quote} But this premise does not work today still, e.g. because docvalues are treated as part of stored fields. so the trap remains. {quote} Unfortunately the current implementation is a bit confusing, but I still want to go that route. {quote} Alternatively, we could just do other work to make that workflow work instead? Users want to do it, why not let them? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Trunk Attachments: LUCENE-3312-DocumentIterators-uwe.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451574#comment-13451574 ] Uwe Schindler commented on LUCENE-3312: --- s/Uwe/Nikola/; Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: 5.0 Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451579#comment-13451579 ] Chris Male commented on LUCENE-3312: David, just at a guess I imagine the branch used in this issue was created before we changed createIndexableFields to not handle storing. To satisfy the conditions at the time (indexing and storing) Nikola changed it to return Field. Lets just fix it and we'll be fine. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: 5.0 Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13451238#comment-13451238 ] David Smiley commented on LUCENE-3312: -- Uwe, can you please explain why you changed SpatialStrategy.createIndexableFields to return Field[] instead of IndexableField[]? As its name suggests and as the javadocs go to some lengths to clarify, createIndexableFields is for indexed data and not storing it. Field implements StorableField. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: 5.0 Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13446923#comment-13446923 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, I am about to reintegrate the branch back o Lucene trunk. We need a new entry for MIGRATE.txt. Can you prepare one, so users of the current Lucene 4.0 API can migrate to the new one? It should give some hints what needs to be changed in the code to make a Lucene 4.0 APP ready for Lucene trunk (5.0)? The migrate.txt is formatted using Markdown syntax, so mostly text-only. I will in all cases commit the reintragrated branch, but I want to add a short guide about the changes at a later stage. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13446939#comment-13446939 ] Nikola Tankovic commented on LUCENE-3312: - Hi Uwe, it would be most helpful if I could see some similar MIGRATE.txt file from previous migrations to see the level of detail, but if it's a hassle I'll probably manage something without it. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13446944#comment-13446944 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, just download the 4.0-BETA release of Lucene. There is a MIGRATE.txt (and corresponding Markdown-generated HTML in the docs folder): http://lucene.apache.org/core/4_0_0-BETA/index.html - http://lucene.apache.org/core/4_0_0-BETA/MIGRATE.html, the source code is here: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/lucene/MIGRATE.txt Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13446962#comment-13446962 ] Chris Male commented on LUCENE-3312: Thanks Uwe and Nikola! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: 5.0 Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13446966#comment-13446966 ] Uwe Schindler commented on LUCENE-3312: --- I opened LUCENE-4347 as container issue for later changes. Nikola, please attach MIGRATE.txt changes to LUCENE-4348, as patches againt Lucene trunk! Thanks. Finally: My thanks also goes to Nikola and Chris for the work on this issue. I want to also mention Robert and Mike for helpful comments. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: 5.0 Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445842#comment-13445842 ] Chris Male commented on LUCENE-3312: I've thought about this a little bit. {quote} To me storing needs no 'type' information at all: But I guess the problem with that is that we need DocValues types since DocValues are stored fields here. {quote} We've gone back and forwards about this a lot since the Fields cleanup began but it would be nice to actually have the DocValues Types on the StorableField itself rather than on StorableFieldType. In the end the type is related to the type of the value itself, not disconnected metadata. Having it this way would also alleviate the need for StorableFieldType and make storing values as simple as possible. {quote} This basically is the same problem all over again. * You make a Document with N StorableFields * You call IR.document and get a StorableDocument back, with N-3 StorableFields. * You wonder: what happened to the other 3 fields? They were DocValues. {quote} What if they were returned? Because you're absolutely right, it seems odd for DocValues Fields to be StorableFields and then not accessible like all other StorableFields. So what if we changed how IR.document worked so you could pull DocValues Fields too. Is that something users might want? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445853#comment-13445853 ] Robert Muir commented on LUCENE-3312: - This is not really a viable option. its n random seeks to retreive n dv fields for a doc. They are not stored fields :) Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445852#comment-13445852 ] Uwe Schindler commented on LUCENE-3312: --- bq. What if they were returned? Because you're absolutely right, it seems odd for DocValues Fields to be StorableFields and then not accessible like all other StorableFields. So what if we changed how IR.document worked so you could pull DocValues Fields too. Is that something users might want? This could be a large overhead if e.g. the loading of the whole column would be triggered automatically (depends on configuration). Also, IndexReader.document() is in the basic IndexReader class (because stored fields can always be returned, also for composite readers). DocValues is AtomicReader only... This could of course be managed by BaseCompositeReader to use the subindex function to get the correct document, but it is somehow not the thing docvalues are made for. They are there for using them while scoring, filtering, functions... Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445854#comment-13445854 ] Uwe Schindler commented on LUCENE-3312: --- Yeah right! Every value is a seek. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445383#comment-13445383 ] Uwe Schindler commented on LUCENE-3312: --- I merged in the recent changes in trunk (rev. 1379200). Robert Muir added lots of JavaDocs to the document and index package, so we should check that everything is still correct. We should especially review sentences that contain hints to stored documents on IndexableDocument and vice versa. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445462#comment-13445462 ] Robert Muir commented on LUCENE-3312: - {code} * If you also need to store the value, you should add a * separate {@link StoredField} instance. ... * */ public class ByteDocValuesField extends StoredField { {code} I opened an issue for this already (LUCENE-4331), but here its really confusing since all thse DocValuesField themselves extend StoredField. I think we need to figure this out. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445481#comment-13445481 ] Robert Muir commented on LUCENE-3312: - I echo Chris on the confusion of StorableField requires IndexableFieldType (since it extends GeneralField). To me storing needs no 'type' information at all: But I guess the problem with that is that we need DocValues types since DocValues are stored fields here. But I think this is related to my comment above: I think its confusing that DocValues fields are treated as Stored fields at all? This basically is the same problem all over again. * You make a Document with N StorableFields * You call IR.document and get a StorableDocument back, with N-3 StorableFields. * You wonder: what happened to the other 3 fields? They were DocValues. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445484#comment-13445484 ] Robert Muir commented on LUCENE-3312: - and some of my javadocs warnings on DocumentStoredFieldVisitor in trunk, that were removed during merging should be added back here again as long as StorableField still has IndexableFieldType: {noformat} * @return Document populated with stored fields. Note that only * the stored information in the field instances is valid, * data such as indexing options, term vector options, * etc is not set. {noformat} Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445503#comment-13445503 ] Robert Muir commented on LUCENE-3312: - By the way, these werent meant to be objections to the issue (just random thoughts while reviewing javadocs). Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13445669#comment-13445669 ] Uwe Schindler commented on LUCENE-3312: --- bq. and some of my javadocs warnings on DocumentStoredFieldVisitor in trunk, sorry that was only this one, I did it because at the time of merging the fact that it still implements IndexableField. But this is the same issue like Chris complained about. We should cover that in a second step. Robert can you commit your javadoc comments back in or should I do it? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: LUCENE-3312-DocumentIterators-uwe.patch, lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch, lucene-3312-patch-14.patch, LUCENE-3312-reintegration.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Uwe Schindler commented on LUCENE-3312 Break out StorableField from IndexableField Hi Nikola, thanks for your work on this GSoC project. The Jenkins job seems to pass, we should now work on reintegrating the branch into trunk. Here my questions to the other committers: Apply only to trunk (5.0) - so it has more time to bake? I think this change would be too big for Lucene 4.0 - and too late?? Are there any other things to change? One open point is StorableFieldType. I would like to integrate it asap, as it gets out of date quite early. I will do a merge from trunk - branch to keep up-to-date. This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Uwe Schindler commented on LUCENE-3312 Break out StorableField from IndexableField Merged up to trunk rev 1377246. This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Chris Male commented on LUCENE-3312 Break out StorableField from IndexableField Apply only to trunk (5.0) - so it has more time to bake? I think this change would be too big for Lucene 4.0 - and too late?? +1 to 5.0 only. It's another big change to the Document/Field API that we may want to evolve more as it bakes and earlier adopters begin to use it. Are there any other things to change? One open point is StorableFieldType. StorableFieldType seems like the only thing at this stage that needs to be addressed. This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Uwe Schindler commented on LUCENE-3312 Break out StorableField from IndexableField I have one comment about the following methods in Document: + /** Obtains all indexed fields in document */ + @Override + public Iterable? extends IndexableField indexableFields() { +IteratorField it = indexedFieldsIterator(); + +ListIndexableField result = new ArrayListIndexableField(); +while(it.hasNext()) { + result.add(it.next()); +} + +return result; + } + + + /** Obtains all stored fields in document. */ + @Override + public Iterable? extends StorableField storableFields() { +IteratorField it = storedFieldsIterator(); + +ListStorableField result = new ArrayListStorableField(); +while(it.hasNext()) { + result.add(it.next()); +} + +return result; + } + In my opinion, this should not copy to an ArrayList, it shoudl simply return a anonymous Iterable.. wrapping the iterator: public Iterable? extends StorableField storableFields() { return new Iterable? extends StorableField() { @Override Iterator? extends StorableField iterator() { return Document.this.storedFieldsIterator(); } } } Also it may not be needed to have ? extends Foo a simple Foo is enough here (comment from Generics Policman) This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
Chris Male commented on LUCENE-3312 Break out StorableField from IndexableField +1 This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13437489#comment-13437489 ] Uwe Schindler commented on LUCENE-3312: --- Merged up to trunk revision: 1374718 Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13437575#comment-13437575 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, afetr the branch merge, the more picky javadocs checker in Lucene Core found few classes without Javadoc at all. It would be good to add Javadocs for the new StorableField, StorableFieldType, GeneralField,... classes. Also please make sure that the ASF License header does not start with /** (which is javadoc), but starts with /* (simple comment). I fixed the ones I found. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436713#comment-13436713 ] Uwe Schindler commented on LUCENE-3312: --- How should we proceed with this? I think the IndexableFieldType vs StorableFieldType situation is not yet decided. How should we proceed? We have to stop working next monday and prepare the final GSoC evaluation. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436755#comment-13436755 ] Chris Male commented on LUCENE-3312: We definitely need to clean up StorableFieldType situation, but I think we can tackle that afterwards. I think it's best to ensure what we have now works and we're comfortable with the API. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436142#comment-13436142 ] Uwe Schindler commented on LUCENE-3312: --- OK, applied and committed the patch, rev 1373940. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch, lucene-3312-patch-13.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434968#comment-13434968 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, we should now use the remaining time to do some cleanup and prepare the branch for merging with Lucene trunk. There will be no backport, so this would be the first Lucene 5.x only change, do we all agree with this? I think the change would be too heavy to go into 4.0. The final pencils down date would be monday next week, the evaluations of GSoC until Thursday, noon UTC next week, so we should hurry up. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434970#comment-13434970 ] Chris Male commented on LUCENE-3312: Is it going to be possible to address IndexableFieldType vs StorableFieldType situation resolved before this lands? I can assist if that would help. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434972#comment-13434972 ] Uwe Schindler commented on LUCENE-3312: --- Hi, I merged up to trunk (to get jenkins config changes in): revision 1373337 I also created a Jenkins Job on the Policeman build server: http://jenkins.sd-datasolutions.de/job/lucene3312-branch/ It will send mails to Nikola and myself on failures. It would be good to adress them asap. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13434996#comment-13434996 ] Uwe Schindler commented on LUCENE-3312: --- First build succeeded: http://jenkins.sd-datasolutions.de/job/lucene3312-branch/1/consoleFull Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13435324#comment-13435324 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, the first build was not done using Oracle JDK 6, so Javadocs were not built. The recent one failed, because of invalid Javadocs. Could you send a patch with those corrected? I would recommend to run ant javadocs or ant javadocs-lint) (more thorough) from top-level. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433093#comment-13433093 ] Uwe Schindler commented on LUCENE-3312: --- Your patch is also missing the replacement file addition. I tried to do it manually, but i have no patch. Can you use SVN 1.7.x and use --show-copies-as-adds (this does not work with SVN 1.6)? This simplifies patches a lot! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433150#comment-13433150 ] Uwe Schindler commented on LUCENE-3312: --- Applied patch to branch in revision 1372427. Now merging trunk in... Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13433161#comment-13433161 ] Uwe Schindler commented on LUCENE-3312: --- Merged up to trunk revision: 1372438 There was one conflict in some TermVectors test, but they now pass. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch, lucene-3312-patch-12a.patch, lucene-3312-patch-12.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431712#comment-13431712 ] Uwe Schindler commented on LUCENE-3312: --- Patch applied revision: 1371131 Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431717#comment-13431717 ] Uwe Schindler commented on LUCENE-3312: --- I merged the branch up to current trunk (revision: 1371142) By this merge, new compile failures in tests occur, mainly caused by new tests added in some commits, using the old API. It would be good to fix those. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431719#comment-13431719 ] Chris Male commented on LUCENE-3312: Hey Nikola, bq. except for mentioned TestQualityRun.testTrecQuality. I'm happy to help work out what is going wrong here, have you done any debugging of the test yourself? What have you worked out so far? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431728#comment-13431728 ] Nikola Tankovic commented on LUCENE-3312: - Well some stats in this test are hurt: {code} Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431807#comment-13431807 ] Robert Muir commented on LUCENE-3312: - Something is going wrong with the indexing of the reuters content. I ran the test with SimpleText on both branches (adding forceMerge(1) for simplicity) and looked at the resulting index: Trunk: {noformat} -rw-rw-r-- 1 rmuir rmuir 13798 Aug 9 09:42 _0_6.len -rw-rw-r-- 1 rmuir rmuir 1022509 Aug 9 09:42 _0.fld -rw-rw-r-- 1 rmuir rmuir1310 Aug 9 09:42 _0.inf -rw-rw-r-- 1 rmuir rmuir 3345582 Aug 9 09:42 _0.pst -rw-rw-r-- 1 rmuir rmuir 513 Aug 9 09:42 _0.si -rw-rw-r-- 1 rmuir rmuir 71 Aug 9 09:42 segments_1 -rw-rw-r-- 1 rmuir rmuir 20 Aug 9 09:42 segments.gen {noformat} Branch: {noformat} -rw-rw-r-- 1 rmuir rmuir 13262 Aug 9 09:46 _4_6.len -rw-rw-r-- 1 rmuir rmuir 290247032 Aug 9 09:45 _4.fld -rw-rw-r-- 1 rmuir rmuir 1310 Aug 9 09:46 _4.inf -rw-rw-r-- 1 rmuir rmuir 459164224 Aug 9 09:46 _4.pst -rw-rw-r-- 1 rmuir rmuir 593 Aug 9 09:46 _4.si -rw-rw-r-- 1 rmuir rmuir71 Aug 9 09:46 segments_1 -rw-rw-r-- 1 rmuir rmuir20 Aug 9 09:46 segments.gen {noformat} Looking into the .fld file, I think the problem is obvious: on trunk: {noformat} doc 0 numfields 5 doc 1 numfields 5 doc 2 numfields 5 {noformat} on branch: {noformat} doc 0 numfields 5 doc 1 numfields 10 doc 2 numfields 15 {noformat} So there is some bug, where a field is 'accumulating' across documents. The last document has 2890. I'm really horrified this is the only test that fails! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431828#comment-13431828 ] Chris Male commented on LUCENE-3312: Wow, I have replicated the same behaviour. On the branch the number of fields per doc is... wow. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431833#comment-13431833 ] Chris Male commented on LUCENE-3312: Ah I think I found the problem, it's in Document, I'll verify in a few seconds. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431836#comment-13431836 ] Robert Muir commented on LUCENE-3312: - nice: the bug wasn't obvious to me (i glanced thru the diff of the branches), but at least SimpleText came to the rescue :) I'm still really really shocked more tests aren't failing for this: I guess maybe it only happens in certain circumstances? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431838#comment-13431838 ] Nikola Tankovic commented on LUCENE-3312: - I found the problem, will report in a minute with solution! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431839#comment-13431839 ] Chris Male commented on LUCENE-3312: Yup found it. The problem is in the branch {{Document#getFields()}} is creating a new List and inside {{DocMaker}} in the benchmark module, it is pulling the Fields and clearing them (using {{clear()}}). Since a new List is being created each time, it is the new List that is getting cleared rather than the actual fields. Hence each iteration just adds more fields without having the previous ones cleared. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431841#comment-13431841 ] Chris Male commented on LUCENE-3312: Nikola, we should probably move all of Document's methods over to just working with Field (and not IndexableField). I don't mind if we want to make getFields() return an immutable list but we then need to provide a clear() method so people can reuse Document instances. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431843#comment-13431843 ] Robert Muir commented on LUCENE-3312: - Can we not return a new list? I don't think we should just work around the problem in DocMaker. This would be a serious sneaky bug to introduce to apps that do this. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431845#comment-13431845 ] Robert Muir commented on LUCENE-3312: - {quote} I don't mind if we want to make getFields() return an immutable list {quote} Thats an ok solution too, so someone would get exception if they do this? Then they would use Document.clear() or whatever else instead? (we should make sure they can still remove things or whatever, just safely). Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431848#comment-13431848 ] Nikola Tankovic commented on LUCENE-3312: - Yes, that is the problem. clear() meathod was clearing not the fields of Document but a copy. Should I go with immutable list, and Document.clear()? Document.getFields().clear() doesn't sound right... Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431849#comment-13431849 ] Chris Male commented on LUCENE-3312: Yeah we definitely shouldn't return a new list. I think the immutable list and Document.clear() combo will suffice. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431851#comment-13431851 ] Chris Male commented on LUCENE-3312: Oh we should also include a unit test that verifies this behaviour. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431852#comment-13431852 ] Robert Muir commented on LUCENE-3312: - +1, nobody should have to debug TestQualityRun :) Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431858#comment-13431858 ] Nikola Tankovic commented on LUCENE-3312: - OK, will do that, and also a Document.clear() test. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431867#comment-13431867 ] Chris Male commented on LUCENE-3312: Nikola, On a totally note totally unrelated to the bug, I noticed that StorableField still returns an IndexableFieldType for type(). This lead me to GeneralField. I don't think we need this. IndexableField should only need name(), tokenStream() and type(). StorableField needs name(), type() and the various xyzValue() accessors. Its type() should be a StorableFieldType and some of the functionality from IndexableFieldType should go there. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431874#comment-13431874 ] Nikola Tankovic commented on LUCENE-3312: - I'm sure someone knows a better way to create immutable List than this: {code} public final ListIndexableField getFields() { ListIndexableField result = new ArrayListIndexableField(); for (IndexableField field : fields) { result.add(field); } return Arrays.asList(result.toArray(new IndexableField[result.size()])); } {code} Any pointers? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431880#comment-13431880 ] Chris Male commented on LUCENE-3312: {code} public final ListField getFields() { return Collections.unmodifiableList(fields); } {code} Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431884#comment-13431884 ] Nikola Tankovic commented on LUCENE-3312: - Chris, I tried to go with StorableFieldType but I ended with a whole lot of mess, after this fix I'll try that again and report if I find problems! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13432041#comment-13432041 ] Michael McCandless commented on LUCENE-3312: Branch + current patch looks great! I just found some minor things: IndexDocument, Document methods (eg the new clear()), GeneralField need javadocs. Import statements should be under the copyright header (eg StoredDocument.java, StorableField.java, GeneralField.java, StoredFieldsWriter.java)? Silly IDEs... Emacs does this correctly ;) Document's add(IndexableField) and add(StorableField) seem dangerous because they secretly cast to oal.document.Field? Ie, I cannot use Document to hold my private Storable/IndexableField implementations. I think we should remove them, leaving only add(Field)? I think StoredDocument should be in oal.index not oal.document? Ie, because it's something you've retrieved from the IndexReader. Also, it will cause confusion with oal.document.Document which is the obvious class you should use to hold all your indexed/stored fields. Why does StoredDocument still have removeField/s? Shouldn't it be read-only? (I feel like a broken record). Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13432182#comment-13432182 ] Uwe Schindler commented on LUCENE-3312: --- I applied patch to current branch, but it does not compile anymore. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch, lucene-3312-patch-10.patch, lucene-3312-patch-11.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431240#comment-13431240 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, next week is pencils down, so we should start to finish this task and do final things like scrub code, write tests, improve documentation (official google description). Did you find out whats causing your test failures? I may try to look into it this evening, so I will try to find out. Should I merge up to trunk? The final week until Fri, 24th should be used to prepare the final branch reintegrate and provide the final patch (that could also be sent to Google). Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13431433#comment-13431433 ] Uwe Schindler commented on LUCENE-3312: --- bq. Some tests throw OutOfMemory errors (but that was also last year), so I think this is one final test to fix. They should not do this when ran with ant test. If they do in eclipse or other GUIs it can happen because the default test -Xmx is 512M for Lucene's build.xml, which is not respected by Eclipse. I will apply the patch later! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch, lucene-3312-patch-09.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424319#comment-13424319 ] Uwe Schindler commented on LUCENE-3312: --- Thanks Nikola, I applied your patch in revision: 1366638 The merging is now running. bq. Lucene tests pass except for 'TestQualityRun' (I'm struggling with this one) I think Robert Muir may be able to help you, he already had some comments about this. I will ping him! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13424320#comment-13424320 ] Uwe Schindler commented on LUCENE-3312: --- Merged up to trunk in revision: 1366643 Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch, lucene-3312-patch-08.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423853#comment-13423853 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, should I merge up the branch to trunk? Do you have anything to commit? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419831#comment-13419831 ] Nikola Tankovic commented on LUCENE-3312: - I keep getting this error: {code} [junit4:junit4] Suite: org.apache.lucene.benchmark.quality.TestQualityRun [junit4:junit4] FAILURE 27.4s | TestQualityRun.testTrecQuality [junit4:junit4] Throwable #1: java.lang.AssertionError: avg-p should be perfect: 0.9856606205097583 expected:1.0 but was:0.9856606205097583 {code} Should I worry? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13419832#comment-13419832 ] Robert Muir commented on LUCENE-3312: - Yes. There aren't many tests that test that Lucene's default ranking is correct, but this is one of them. This means something is wrong... Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414097#comment-13414097 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, the second half of the GSoC started now. What are the plans for the second part? I expected that the Solr+Test changes proceed now, so we have enough time check the new API in general use and fix issues! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414100#comment-13414100 ] Nikola Tankovic commented on LUCENE-3312: - Hi Uwe, in a few days I will hopefully finish with Solr+Test part. Then we can do another round of discussion, API checking and modifying. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414102#comment-13414102 ] Uwe Schindler commented on LUCENE-3312: --- Is there anything at the moment you need help with? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13414107#comment-13414107 ] Nikola Tankovic commented on LUCENE-3312: - Not at the moment, just little more time :) A lot of code here :) Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13411355#comment-13411355 ] Uwe Schindler commented on LUCENE-3312: --- Nikola: Did you get the test running now? Otherwise I see no problems with the code at the moment, but I will wait for other comment contributions! I still have a question to the iterator again: {code:java} public abstract class FilterIteratorT, U extends T implements IteratorT { {code} This seems strange U extends T, so the iterator returns a wider type than it was in the original. I would expect it to be the other way round. In general for this FilteredIterator I would make no generics magic and let it return the same as the delegate. If the predicate changes type, then this should be done by the caller (who provides the predicate). Do I miss something? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13411920#comment-13411920 ] Nikola Tankovic commented on LUCENE-3312: - Hi Uwe, I have the test fixed, and both Lucene and Solr are successfully compiling now :) You are right about the FilterIterator, {code} public abstract class FilterIteratorT implements IteratorT { {code} is enough.. I am now fixing test on rest of Lucene and Solr. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409375#comment-13409375 ] Chris Male commented on LUCENE-3312: bq. Hmm ... if the test is running fine today, not storing the id field, then why would it need to start storing it on switching to returning StoredDocument from IR.document...? In theory this should be a rote change? Good point bq. But we still seem to have StoredDocument.removeField/s methods? Shouldn't that class be read-only? +1 Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409402#comment-13409402 ] Chris Male commented on LUCENE-3312: bq. Regarding 'deleteField' in 'StoredDocument', I cannot remove it 'cause of PersistentSnapshotDeletionPolicy::readSnapshotsInfo function, I guess it needs refactoring. All that code seems to be doing is removing a specific field from the Document and then iterating over the remaining values in the Document. It seems an easy change to just skip the field during the for loop. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409419#comment-13409419 ] Uwe Schindler commented on LUCENE-3312: --- I applied the patch to the branch: At revision: 1359139 I will now do svn merge to keep branch up-to-date! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409432#comment-13409432 ] Uwe Schindler commented on LUCENE-3312: --- Done, no new compile failures! At revision: 1359151 I get 2 unneeded cast warnings in lucene-core, that's all. Lucene Core Tests pass, did you disable the failing tests? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409461#comment-13409461 ] Nikola Tankovic commented on LUCENE-3312: - Hi Uwe, how can I see those warnings? Thank you! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409464#comment-13409464 ] Uwe Schindler commented on LUCENE-3312: --- ant compile on command line. Before sending the patch, you should *always* build test on command line with ant! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409476#comment-13409476 ] Nikola Tankovic commented on LUCENE-3312: - Yes, of course :) Mike taught me that. The problem is I don't see those warning with ant compile. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409481#comment-13409481 ] Uwe Schindler commented on LUCENE-3312: --- Maybe you missed to clean before? ANT's javac only compiles files not yet compiled, so warnings only show on first time? {noformat} C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\coreant clean compile Buildfile: C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\core\build.xml clean: jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: ivy-availability-check: ivy-fail: ivy-configure: [ivy:configure] :: Ivy 2.2.0 - 20100923230623 :: http://ant.apache.org/ivy/ :: [ivy:configure] :: loading settings :: file = C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\ivy-settings.xml resolve: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\build\core\classes\java [javac] Compiling 634 source files to C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\build\core\classes\java [javac] C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\core\src\java\org\apache\lucene\index\DocFieldProcessor.java:24 3: warning: [cast] redundant cast to org.apache.lucene.index.StorableField [javac] consumer.add(docState.docID, (StorableField) field); [javac]^ [javac] C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\core\src\java\org\apache\lucene\index\NormsConsumerPerField.jav a:57: warning: [cast] redundant cast to org.apache.lucene.index.StorableField [javac] consumer.add(docState.docID, (StorableField) field); [javac] ^ [javac] 2 warnings [copy] Copying 2 files to C:\Users\Uwe Schindler\Projects\lucene\lucene3312\lucene\build\core\classes\java compile-core: compile: BUILD SUCCESSFUL Total time: 13 seconds {noformat} Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409484#comment-13409484 ] Nikola Tankovic commented on LUCENE-3312: - Sorry for my n00b-iness. Thank you! Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13409641#comment-13409641 ] Uwe Schindler commented on LUCENE-3312: --- I merged trunk again (because of LUCENE-4199): revision: 1359283 Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch, lucene-3312-patch-07.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13408973#comment-13408973 ] Michael McCandless commented on LUCENE-3312: Hmm ... if the test is running fine today, not storing the id field, then why would it need to start storing it on switching to returning StoredDocument from IR.document...? In theory this should be a rote change? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13408997#comment-13408997 ] Michael McCandless commented on LUCENE-3312: Branch looks good! But we still seem to have StoredDocument.removeField/s methods? Shouldn't that class be read-only? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13408611#comment-13408611 ] Nikola Tankovic commented on LUCENE-3312: - Hi, I'm going over tests and so far so good. I am seeing a lot of id fields in document that aren't stored, e.g.: {code} doc.add(new IntField(id, docCount, Field.Store.YES)); {code} ... and having errors because my returned StoredDocument only shows stored fields. Can I convert these id fields into stored in tests? Or did we do something wrong with StoredDocument? Nikola Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13408626#comment-13408626 ] Chris Male commented on LUCENE-3312: If the test is wanting to retrieve the ID field for a StoredDocument then yes the field will need to be stored. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13407894#comment-13407894 ] Nikola Tankovic commented on LUCENE-3312: - Uwe, I apologize for inconveniences. I don't use AbstractIterator from Google any more, and I did put custom implementation in util, but I forgot to remove import declaration. I certainly will configure my IDE to not change unrelated code (apologies once again). I switched to your new branch, and will work on it from now. The big question is: can I go to fixing solr and tests or do you think there is some major API change left to do? Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13407897#comment-13407897 ] Chris Male commented on LUCENE-3312: My feeling at least is that we should definitely get going on Solr and tests since they are good ways to see if the API can be consumed. A failing test might reveal something we haven't considered. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13407898#comment-13407898 ] Uwe Schindler commented on LUCENE-3312: --- Fine, thanks. I had no time to do API wise checks, maybe Chris had a closer look. Great work in all cases! :-) Uwe. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13407509#comment-13407509 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, I merged your patch to current tunk (removed lots of throws CorruptIndexException) and undid some formatti ng Changes in IndexReader. Can you configure your IDE to *not* change unrelated code? This makes merging extremely hard. I committed this in r1357938 to a new branch https://svn.apache.org/repos/asf/lucene/dev/branches/lucene3312 (from trunk r1357904, Lucene 5.0). Please use this branch for further work! I will merge it regularily when bigger changes are in main dev, so please keep it updated. Please provide new patches against this branch. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13407524#comment-13407524 ] Uwe Schindler commented on LUCENE-3312: --- And as noted before: In Lucene core we dont use extrenal dependencies, so we have a compile failure because of AbstractIterator from Google Collect, we have to put this one into Lucene utils. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404449#comment-13404449 ] Uwe Schindler commented on LUCENE-3312: --- Hi Nikola, I will think about the core API and give my comments later. As changing tests and solr is really the biggest change, we should create a branch and do it step for step. I would commit the current patch to a branched trunk (5.0) and then you can work with a new checkout from there and I will commit the later steps. This also allows heavy™ commiting™ by other committers. Unfortunately I cannot give you commit access to Apache's SVN. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404450#comment-13404450 ] Nikola Tankovic commented on LUCENE-3312: - Agreed! No problem about commit access, I'll send patches :) Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch, lucene-3312-patch-06.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3312) Break out StorableField from IndexableField
[ https://issues.apache.org/jira/browse/LUCENE-3312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13398324#comment-13398324 ] Chris Male commented on LUCENE-3312: Hey Nikola, Just did a quick pass over the patch. I have an alternative way to do the Indexable/StorableFieldsIterator in Document (it'll need the policeman's tick though): {code} public abstract class SelectiveIteratorT implements IteratorT { private T next; private final ListT list; private int pos; public SelectiveIterator(ListT list) { this.list = list; } @Override public void remove() { throw new UnsupportedOperationException(); } @Override public boolean hasNext() { for (; pos list.size(); pos++) { T t = list.get(pos); if (isNext(t)) { next = t; return true; } } return false; } @Override public T next() { return next; } abstract boolean isNext(T t); } {code} I think that'll work. Then you can just create two instances which implement {{isNext}} differently. I also noticed that you've included {{import org.apache.commons.lang.NotImplementedException;}} in Document which will also need to be removed. Break out StorableField from IndexableField --- Key: LUCENE-3312 URL: https://issues.apache.org/jira/browse/LUCENE-3312 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Nikola Tankovic Labels: gsoc2012, lucene-gsoc-12 Fix For: Field Type branch Attachments: lucene-3312-patch-01.patch, lucene-3312-patch-02.patch, lucene-3312-patch-03.patch, lucene-3312-patch-04.patch, lucene-3312-patch-05.patch In the field type branch we have strongly decoupled Document/Field/FieldType impl from the indexer, by having only a narrow API (IndexableField) passed to IndexWriter. This frees apps up use their own documents instead of the user-space impls we provide in oal.document. Similarly, with LUCENE-3309, we've done the same thing on the doc/field retrieval side (from IndexReader), with the StoredFieldsVisitor. But, maybe we should break out StorableField from IndexableField, such that when you index a doc you provide two Iterables -- one for the IndexableFields and one for the StorableFields. Either can be null. One downside is possible perf hit for fields that are both indexed stored (ie, we visit them twice, lookup their name in a hash twice, etc.). But the upside is a cleaner separation of concerns in API -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org