[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049014#comment-13049014 ] Simon Willnauer commented on SOLR-2242: --- Bill, this seems like an important issue. Many votes etc. I am on travel right now so give me some days to come back and I will work with you to get this done. Thanks for your patience simon > Get distinct count of names for a facet field > - > > Key: SOLR-2242 > URL: https://issues.apache.org/jira/browse/SOLR-2242 > Project: Solr > Issue Type: New Feature > Components: Response Writers >Affects Versions: 4.0 >Reporter: Bill Bell >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, > SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch > > > When returning facet.field= you will get a list of matches for > distinct values. This is normal behavior. This patch tells you how many > distinct values you have (# of rows). Use with limit=-1 and mincount=1. > The feature is called "namedistinct". Here is an example: > http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price > http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price > http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price > This currently only works on facet.field. > {code} > > > 14 > 31 name="19.95">111 name="179.99">111 name="329.95">111 name="479.95">111 > > > {code} > Several people use this to get the group.field count (the # of groups). -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
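The counting behavior the issue describes — reporting how many distinct facet values (rows) a field has, rather than summing document counts — can be sketched client-side. This is a minimal illustration of the idea, not the patch's code; the field name and counts are made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrates what facet.numFacetTerms reports: the number of distinct
 *  facet values (rows), not the sum of their document counts. */
public class DistinctFacetCount {

    /** Count facet entries whose document count is at least mincount. */
    public static int countDistinct(Map<String, Integer> facetCounts, int mincount) {
        int distinct = 0;
        for (int count : facetCounts.values()) {
            if (count >= mincount) {
                distinct++;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        // Hypothetical facet response for field "price": value -> doc count.
        Map<String, Integer> price = new LinkedHashMap<String, Integer>();
        price.put("19.95", 3);
        price.put("179.99", 1);
        price.put("329.95", 1);
        price.put("479.95", 0);
        // With mincount=1 the zero-count bucket is excluded.
        System.out.println(countDistinct(price, 1)); // prints 3
    }
}
```

This also shows why the issue recommends limit=-1 and mincount=1: the count is only meaningful when every nonzero bucket is returned.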
[jira] [Assigned] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned SOLR-2242:

Assignee: Simon Willnauer
[jira] [Commented] (SOLR-2523) SolrJ QueryResponse doesn't support range facets
[ https://issues.apache.org/jira/browse/SOLR-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049012#comment-13049012 ] Martijn van Groningen commented on SOLR-2523:

SOLR-1896 looks like what this issue needs. The change that I suggested involves having the datemath syntax in the regular Solr query parser, but I think SOLR-1896 is just fine for this issue. Maybe changing the regular query parser is a bit too much.

> SolrJ QueryResponse doesn't support range facets
>
> Key: SOLR-2523
> URL: https://issues.apache.org/jira/browse/SOLR-2523
> Project: Solr
> Issue Type: Improvement
> Components: clients - java
> Reporter: Martijn van Groningen
> Assignee: Martijn van Groningen
> Priority: Trivial
> Fix For: 3.3, 4.0
>
> Attachments: SOLR-2523.patch, SOLR-2523.patch
>
> It is possible to get date facets and pivot facets in SolrJ.
> {code:java}
> queryResponse.getFacetDate();
> queryResponse.getFacetPivot();
> {code}
> Having this also for range fields would be nice. Adding it is trivial.
> Maybe we should deprecate the date facet methods in the QueryResponse class,
> since they are superseded by range facets. Also, some set / add / remove
> methods for setting facet range parameters on the SolrQuery class would be nice.
Re: Lucene Facet path
I think it can be a subtask of LUCENE-3079 and we should first focus on the general faceting features. As far as I know there is no bitset impl. out there for faceting.

On 14 Jun 2011 00:08, "Jason Rutherglen" wrote:
> Martijn, if the title is correct ("Post grouping faceting") then maybe
> the bit set based system should be a separate issue? E.g., is there a
> bit set implementation today in LUCENE-3079?
>
> On Mon, Jun 13, 2011 at 2:58 PM, Martijn v Groningen wrote:
>> There is already an issue open for this: LUCENE-3079
>>
>> As the issue describes, the faceting in Solr relies on the schema (and of
>> course the UIF). So having the notion of a FieldType in the facet module
>> would be very helpful for selecting the right facet implementation.
>> Currently in Solr there is only one facet method for field faceting that
>> works per-segment, but I think in the end we would want all facet types
>> and methods to work on a per-segment basis.
>> Martijn
>> On 13 June 2011 23:47, Jason Rutherglen wrote:
>>> I think it's a better approach than rewriting Solr's internals. E.g.,
>>> small development steps could be taken, using the knowledge learned
>>> from Solr's facet system. E.g., caching and intersecting bit sets would
>>> be an easy-ish first step?
>>>
>>> On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer wrote:
>>> > I believe people are already looking into that but I am not sure.
>>> > Sounds reasonable to me, but I think it's going to be lots of work.
>>> >
>>> > simon
>>> >
>>> > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen wrote:
>>> >> Are we going the direction of creating full facet features outside of
>>> >> Solr? E.g., we have UIF extrapolated out, we can probably make a module
>>> >> for bit set intersections as well. In the process the faceting will
>>> >> go per-segment.
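The "caching and intersecting bit sets" step discussed in this thread can be sketched with plain `java.util.BitSet`: cache one set of matching docIDs per facet value, intersect it with the query's result set, and report the cardinality as the facet count. All names and data below are illustrative, not Lucene or Solr code:

```java
import java.util.BitSet;

/** Sketch of bit-set faceting as discussed in the thread: one cached
 *  BitSet of docIDs per facet value, intersected with the query result. */
public class BitSetFacetSketch {

    /** Facet count = |queryDocs AND valueDocs|, without mutating either set. */
    public static int facetCount(BitSet queryDocs, BitSet valueDocs) {
        BitSet intersection = (BitSet) queryDocs.clone(); // clone so the cached set survives
        intersection.and(valueDocs);
        return intersection.cardinality();
    }

    public static void main(String[] args) {
        BitSet queryDocs = new BitSet();
        queryDocs.set(1); queryDocs.set(3); queryDocs.set(5);

        BitSet colorRed = new BitSet(); // docs whose hypothetical "color" field is "red"
        colorRed.set(3); colorRed.set(4); colorRed.set(5);

        System.out.println(facetCount(queryDocs, colorRed)); // docs 3 and 5 match: 2
    }
}
```

Going per-segment, as the thread suggests, would mean keeping one such set per segment and summing the cardinalities, so cached sets survive merges of other segments.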
Re: commit-check target for ant?
> about, we could have the same argument about requiring 1.7 to compile but
> supporting binary releases that run on 1.6; or an argument about whether we
> should use a commercial tool that committers have a license for to "build"
> java code from a source grammar -- the point is that as an open source
> project i think it's really important that *all* our users be allowed to

I think I got lost in this discussion somehow -- users will be able to compile from sources, just not with the 1.5 compiler... But they'd still be able to compile their source code to 1.5 binaries with an open source toolchain.

Anyway, to me, the points for 1.6 are:
1) array resizing intrinsics in Arrays.* (at least theoretically, no need for double allocation during array resizing).
2) @Override on interface-overriding methods.

None of these are critical, but retroweaving back to 1.5 bytecode can provide a clean way of using both and keep 1.5 users happy. Like you mentioned -- this probably does come down to the argument of switching to 1.6 as the supported platform; then, a 1.5 compatibility backport would be a nice touch for those who still need it.

Dawid
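Point 1 above can be made concrete: `Arrays.copyOf`, new in Java 6, grows an array in one call, where Java 5 code had to spell out the allocate-then-copy idiom. A small sketch comparing the two:

```java
import java.util.Arrays;

/** Demonstrates the Java 6 Arrays.copyOf mentioned in the thread versus
 *  the Java 5 allocate-then-arraycopy idiom it replaces. */
public class ArrayResize {

    // Java 5 style: the allocation and copy are two explicit steps.
    static int[] growJava5(int[] src, int newLength) {
        int[] dst = new int[newLength];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // Java 6 style: one call, which the JVM may treat as an intrinsic.
    static int[] growJava6(int[] src, int newLength) {
        return Arrays.copyOf(src, newLength);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        // Both produce {1, 2, 3, 0, 0, 0}.
        System.out.println(Arrays.equals(growJava5(a, 6), growJava6(a, 6)));
    }
}
```

Whether HotSpot actually avoids the zero-fill of the destination array is the "at least theoretically" hedge in the message; the source-level simplification is the uncontroversial part.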
[jira] [Commented] (SOLR-2551) Checking dataimport.properties for write access during startup
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049000#comment-13049000 ] C S commented on SOLR-2551:

I might be wrong, but isn't dataimport.properties written even if no delta-query is configured? So even if you'd just be interested in full imports, you'll run into an exception at the end of your full import when Solr attempts to write the timestamp into dataimport.properties.

However, if that's not the case (i.e. a non-writable dataimport.properties does not break a full import), I'd suggest that the check whether dataimport.properties is writable should only be done if a delta-query is defined, and in that case it should refuse to start the import.

> Checking dataimport.properties for write access during startup
>
> Key: SOLR-2551
> URL: https://issues.apache.org/jira/browse/SOLR-2551
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 1.4.1, 3.1
> Reporter: C S
> Assignee: Shalin Shekhar Mangar
> Priority: Minor
>
> A common mistake is that the /conf directory (respectively the dataimport.properties file)
> is not writable for Solr. It would be great if that were detected when starting a dataimport job.
> Currently an import might grind away for days and fail if it can't write its
> timestamp to the dataimport.properties file.
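The startup check the issue asks for is straightforward to sketch. This is not the DataImportHandler code, just an illustration of failing fast when the properties file (or, if it does not exist yet, its directory) is not writable:

```java
import java.io.File;
import java.io.IOException;

/** Sketch of a pre-import writability check for dataimport.properties:
 *  fail before the import starts instead of after days of indexing. */
public class WritableCheck {

    /** True if the file can be written, or can be created in its directory. */
    public static boolean canPersistTimestamp(File props) {
        if (props.exists()) {
            return props.canWrite();
        }
        File dir = props.getAbsoluteFile().getParentFile();
        return dir != null && dir.canWrite();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for <solr-home>/conf/dataimport.properties.
        File props = File.createTempFile("dataimport", ".properties");
        props.deleteOnExit();
        if (!canPersistTimestamp(props)) {
            throw new IllegalStateException(props + " is not writable; refusing to start import");
        }
        System.out.println("ok to import");
    }
}
```

Per the comment above, a real implementation might run this check unconditionally (if full imports also write the timestamp) or only when a delta-query is configured.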
[jira] [Commented] (LUCENE-3199) Add non-destructive sort to BytesRefHash
[ https://issues.apache.org/jira/browse/LUCENE-3199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048987#comment-13048987 ] Jason Rutherglen commented on LUCENE-3199:

I think the issue with this, as it relates to realtime search, is that in order to sort, we'll need to freeze indexing.

> Add non-destructive sort to BytesRefHash
>
> Key: LUCENE-3199
> URL: https://issues.apache.org/jira/browse/LUCENE-3199
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/index
> Affects Versions: 4.0
> Reporter: Jason Rutherglen
> Priority: Minor
>
> Currently the BytesRefHash is destructive. We can add a method that returns
> a non-destructively generated int[].
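The improvement requested here — a sorted int[] view that leaves the hash untouched — can be illustrated generically: sort a copy of the ids by the values they point at, leaving the original order intact. This is a sketch of the idea with made-up data, not the BytesRefHash code:

```java
import java.util.Arrays;
import java.util.Comparator;

/** Sketch of a non-destructive sort: return a sorted copy of the ids
 *  instead of reordering the structure in place, so concurrent readers
 *  (e.g. realtime search) still see the original layout. */
public class NonDestructiveSort {

    /** Stand-in for the bytes each ord points at. */
    static final String[] VALUES = {"delta", "alpha", "charlie", "bravo"};

    /** Returns a new int[] of ords sorted by value; the input is untouched. */
    public static int[] sortedOrds(int[] ords) {
        Integer[] boxed = new Integer[ords.length];
        for (int i = 0; i < ords.length; i++) boxed[i] = ords[i];
        Arrays.sort(boxed, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return VALUES[a].compareTo(VALUES[b]);
            }
        });
        int[] result = new int[ords.length];
        for (int i = 0; i < ords.length; i++) result[i] = boxed[i];
        return result;
    }

    public static void main(String[] args) {
        int[] ords = {0, 1, 2, 3};
        int[] sorted = sortedOrds(ords);
        System.out.println(Arrays.toString(ords));   // [0, 1, 2, 3] - preserved
        System.out.println(Arrays.toString(sorted)); // [1, 3, 2, 0] - by value
    }
}
```

Jason's comment still applies to this sketch: the copy is only a consistent snapshot if the underlying values do not change mid-sort, which is why realtime indexing would need to pause or version the data.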
[JENKINS] Lucene-trunk - Build # 1594 - Still Failing
Build: https://builds.apache.org/job/Lucene-trunk/1594/

3 tests failed.

FAILED: org.apache.lucene.index.TestPerFieldCodecSupport.testStressPerFieldCodec
Error Message: GC overhead limit exceeded
Stack Trace:
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.lucene.util.fst.FST.getBytesReader(FST.java:849)
    at org.apache.lucene.util.fst.FST.readFirstRealArc(FST.java:565)
    at org.apache.lucene.util.fst.NodeHash.hash(NodeHash.java:92)
    at org.apache.lucene.util.fst.NodeHash.addNew(NodeHash.java:141)
    at org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:161)
    at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:126)
    at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:146)
    at org.apache.lucene.util.fst.Builder.compilePrevTail(Builder.java:232)
    at org.apache.lucene.util.fst.Builder.add(Builder.java:349)
    at org.apache.lucene.util.fst.Builder.add(Builder.java:262)
    at org.apache.lucene.index.codecs.simpletext.SimpleTextFieldsReader$SimpleTextTerms.loadTerms(SimpleTextFieldsReader.java:494)
    at org.apache.lucene.index.codecs.simpletext.SimpleTextFieldsReader$SimpleTextTerms.<init>(SimpleTextFieldsReader.java:460)
    at org.apache.lucene.index.codecs.simpletext.SimpleTextFieldsReader.terms(SimpleTextFieldsReader.java:561)
    at org.apache.lucene.index.PerFieldCodecWrapper$FieldsReader$FieldsIterator.terms(PerFieldCodecWrapper.java:152)
    at org.apache.lucene.index.MultiFieldsEnum.terms(MultiFieldsEnum.java:113)
    at org.apache.lucene.index.codecs.FieldsConsumer.merge(FieldsConsumer.java:50)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:573)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:116)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3461)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3105)
    at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1873)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1868)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1864)
    at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1482)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1234)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1215)
    at org.apache.lucene.index.TestPerFieldCodecSupport.testStressPerFieldCodec(TestPerFieldCodecSupport.java:306)

FAILED: org.apache.lucene.search.TestPhraseQuery.testRandomPhrases
Error Message: GC overhead limit exceeded
Stack Trace:
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexInput.reader(FixedIntBlockIndexInput.java:51)
    at org.apache.lucene.index.codecs.intblock.FixedIntBlockIndexInput.reader(FixedIntBlockIndexInput.java:39)
    at org.apache.lucene.index.codecs.sep.SepPostingsReaderImpl$SepDocsAndPositionsEnum.<init>(SepPostingsReaderImpl.java:522)
    at org.apache.lucene.index.codecs.sep.SepPostingsReaderImpl.docsAndPositions(SepPostingsReaderImpl.java:283)
    at org.apache.lucene.index.codecs.BlockTermsReader$FieldReader$SegmentTermsEnum.docsAndPositions(BlockTermsReader.java:707)
    at org.apache.lucene.index.MultiTermsEnum.docsAndPositions(MultiTermsEnum.java:388)
    at org.apache.lucene.index.codecs.TermsConsumer.merge(TermsConsumer.java:92)
    at org.apache.lucene.index.codecs.FieldsConsumer.merge(FieldsConsumer.java:53)
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:573)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:116)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3461)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3105)
    at org.apache.lucene.index.SerialMergeScheduler.merge(SerialMergeScheduler.java:37)
    at org.apache.lucene.index.IndexWriter.maybeMerge(IndexWriter.java:1873)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1683)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1638)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1608)
    at org.apache.lucene.index.RandomIndexWriter.doRandomOptimize(RandomIndexWriter.java:315)
    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:328)
    at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:308)
    at org.apache.lucene.search.TestPhraseQuery.testRandomPhrases(TestPhraseQuery.java:662)

FAILED: org.apache.lucene.util.fst.TestFSTs.testBigSet
Error Message: Java heap space
Stack Trace:
java.lang.OutOfMemoryError: J
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048967#comment-13048967 ] Bill Bell commented on SOLR-2242:

Lance, this patch just takes the # of lines coming out of the facet section for a field and tells you how many you have. It does not do anything to change the facet, or deal with white space, or anything complicated. This is a simple counter.

Bill
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048966#comment-13048966 ] Bill Bell commented on SOLR-2242:

Thanks Mike. I think it is committable since shards work now. We might need to fix some broken tests (and I am willing to do that). Then we can move to range and queries... Thanks.
[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Description: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". Here is an example: http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price This currently only works on facet.field. {code} 14 31 {code} Several people use this to get the group.field count (the # of groups). was: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". 
Here is an example:
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price
Here is an example on field "hgid" (without namedistinct):
{code}
- - 1 1 1 1 1 5 1
{code}
With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns the number of rows (7), not the number of values (11).
{code}
- - 7
{code}
This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!
[jira] [Updated] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2242: Description: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". Here is an example: http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=0&facet.limit=-1&facet.field=price http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=1&facet.limit=-1&facet.field=price Here is an example on field "hgid" (without namedistinct): {code} - - 1 1 1 1 1 5 1 {code} With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns number of rows (7), not the number of values (11). {code} - - 7 {code} This works actually really good to get total number of fields for a group.field=hgid. Enjoy! was: When returning facet.field= you will get a list of matches for distinct values. This is normal behavior. This patch tells you how many distinct values you have (# of rows). Use with limit=-1 and mincount=1. The feature is called "namedistinct". 
Here is an example:
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1
Here is an example on field "hgid" (without namedistinct):
{code}
- - 1 1 1 1 1 5 1
{code}
With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns the number of rows (7), not the number of values (11).
{code}
- - 7
{code}
This actually works really well for getting the total number of groups for group.field=hgid. Enjoy!
[jira] [Commented] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048965#comment-13048965 ] Yonik Seeley commented on SOLR-2554:

Thanks Vadim, I can reproduce this on 3.1 and branch_3x (but trunk seems to work fine), and I'll look into fixing it tomorrow.

> RandomSortField values are cached in the FieldCache
>
> Key: SOLR-2554
> URL: https://issues.apache.org/jira/browse/SOLR-2554
> Project: Solr
> Issue Type: Bug
> Components: search
> Affects Versions: 3.1
> Reporter: Vadim Geshel
>
> The values of RandomSortField get cached in the FieldCache. When using many
> RandomSortFields over time, this leads to running out of memory.
> This may be one of the cases already covered in SOLR- but I'm not sure.
[jira] [Commented] (SOLR-2242) Get distinct count of names for a facet field
[ https://issues.apache.org/jira/browse/SOLR-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048964#comment-13048964 ] Bill Bell commented on SOLR-2242:

Lance, there are literally 15 lines of code changes; not sure how you cannot follow it. I could use no memory and just loop through the results, but that would not be cached, so the speed would still be slow since I need to pull in the array in order to count it.

The field is not called namedistinct anymore... it is called facet.numFacetTerms=2,1,0. All other parameters are good. Also, you do not need anything else to get it to work, since I set the defaults to work for you now. I'll see if I can write some more tests.

Here is the rub: I would be happy to write hundreds of test cases if I knew someone was going to actually help me get this done. I am used to having a committer actually work with me - Mike McCandless is awesome and we worked on several issues together. But I have seen tons of features die when no one is willing to help. So here I am wanting, willing and able to get this done, and I have no one willing to assist from a committer perspective...

The patch works fine in sharded and normal mode, so people can use it today; it is just not committed. I have 4 clients using it in production and one has 100M page views a year, and so far no problems.
http://localhost:8983/solr/select?shards=localhost:8983/solr,localhost:7574/solr&indent=true&q=*:*&facet=true&facet.mincount=1&facet.numFacetTerms=2&facet.limit=-1&facet.field=price > Get distinct count of names for a facet field > - > > Key: SOLR-2242 > URL: https://issues.apache.org/jira/browse/SOLR-2242 > Project: Solr > Issue Type: New Feature > Components: Response Writers >Affects Versions: 4.0 >Reporter: Bill Bell >Priority: Minor > Fix For: 4.0 > > Attachments: SOLR-2242.patch, SOLR-2242.shard.patch, > SOLR-2242.solr3.1.patch, SOLR.2242.solr3.1.patch, SOLR.2242.v2.patch > > > When returning facet.field= you will get a list of matches for > distinct values. This is normal behavior. This patch tells you how many > distinct values you have (# of rows). Use with limit=-1 and mincount=1. > The feature is called "namedistinct". Here is an example: > http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=manu&facet.mincount=1&facet.limit=-1&f.manu.facet.namedistinct=0&facet.field=price&f.price.facet.namedistinct=1 > Here is an example on field "hgid" (without namedistinct): > {code} > - > - > 1 > 1 > 1 > 1 > 1 > 5 > 1 > > > {code} > With namedistinct (HGPY045FD36D4000A, HGPY0FBC6690453A9, > HGPY1E44ED6C4FB3B, HGPY1FA631034A1B8, HGPY3317ABAC43B48, > HGPY3A17B2294CB5A, HGPY3ADD2B3D48C39). This returns number of rows > (7), not the number of values (11). > {code} > - > - > 7 > > > {code} > This works actually really good to get total number of fields for a > group.field=hgid. Enjoy! -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
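The semantics of the namedistinct/numFacetTerms count can be sketched without Solr at all: it is the number of facet rows whose count clears mincount, not the sum of the counts. A minimal Java sketch of that distinction, mirroring the hgid example above (class and method names are illustrative, and the per-term counts here are assumed for illustration, not taken from the actual index):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NamedDistinctSketch {
    // Returns the number of distinct facet terms with count >= mincount
    // (the "numFacetTerms" result), as opposed to the sum of the counts.
    static int distinctTerms(Map<String, Integer> facetCounts, int mincount) {
        int distinct = 0;
        for (int count : facetCounts.values()) {
            if (count >= mincount) distinct++;
        }
        return distinct;
    }

    public static void main(String[] args) {
        // Illustrative counts: 7 distinct hgid values over 11 matching docs.
        // (Which key carries the count of 5 is assumed, not known from the issue.)
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("HGPY045FD36D4000A", 1);
        counts.put("HGPY0FBC6690453A9", 1);
        counts.put("HGPY1E44ED6C4FB3B", 1);
        counts.put("HGPY1FA631034A1B8", 1);
        counts.put("HGPY3317ABAC43B48", 1);
        counts.put("HGPY3A17B2294CB5A", 5);
        counts.put("HGPY3ADD2B3D48C39", 1);
        System.out.println(distinctTerms(counts, 1)); // 7 rows, not 11 values
    }
}
```

This is why the patch recommends limit=-1 and mincount=1: the count is only meaningful when all qualifying rows are enumerated.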
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3200: Attachment: LUCENE-3200.patch same as uwe's patch, but i also nuked the previous hack in TestTermVectorsReader, as MMapDir returns read past EOF now like the others. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200.patch, LUCENE-3200.patch, > LUCENE-3200.patch, LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. 
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3200: -- Attachment: LUCENE-3200.patch Same patch with Robert's tests included. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200.patch, LUCENE-3200.patch, > LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. 
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3200: -- Attachment: LUCENE-3200.patch New patch with some minor issues fixed: - fixed the RuntimeException - fixed readByte to throw EOF if we are at the end of the (n-1)th buffer. As buffer n may be size 0, we will throw BufferUnderflow in the catch block. I added hasRemaining() there, so it's consistent with readBytes. - The check for an invalid power was bogus (0 is allowed, leads to buffer size 1) - The check for RandomAccessFile too big for maximum buffer size did not respect the additional buffer. nrBuffers can then overflow easily > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200.patch, > LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. 
All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
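The arithmetic the description proposes is the standard power-of-two trick: with a chunk size of 2^k, division becomes a right shift and the modulo becomes an AND against a mask precomputed in the constructor. A minimal sketch of that calculation (field and class names here are illustrative, not the actual patch's):

```java
public class PowerOfTwoSeek {
    final int chunkSizePower;  // k, so each mapped chunk is 2^k bytes
    final long chunkSizeMask;  // (2^k - 1), the AND mask replacing the modulo

    PowerOfTwoSeek(int chunkSizePower) {
        this.chunkSizePower = chunkSizePower;
        this.chunkSizeMask = (1L << chunkSizePower) - 1;
    }

    // which mapped buffer a file position falls into: pos / 2^k as a shift
    int bufferIndex(long pos) { return (int) (pos >>> chunkSizePower); }

    // offset within that buffer: pos % 2^k as a mask
    int bufferOffset(long pos) { return (int) (pos & chunkSizeMask); }

    public static void main(String[] args) {
        // tiny 16-byte chunks (k = 4) just for illustration
        PowerOfTwoSeek s = new PowerOfTwoSeek(4);
        System.out.println(s.bufferIndex(37));  // 37 / 16 = 2
        System.out.println(s.bufferOffset(37)); // 37 % 16 = 5
    }
}
```

With a maximum k of 30 this also explains the 2^30 versus 2^31-1 point: 2^31-1 is not a power of two, so neither the shift/mask trick nor OS page alignment works with it.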
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3200: Attachment: LUCENE-3200_tests.patch here are some additional stress tests for mmap > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch, LUCENE-3200_tests.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. 
[jira] [Reopened] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vadim Geshel reopened SOLR-2554: Sorry, I should have been more specific. This happens if you use a RandomSortField in a query, not as a sort criterion: http://localhost:8983/solr/select/?q={!func}random_foo You should immediately see this in stats.jsp#cache, I see this: entry#1 : 'org.apache.lucene.store.MMapDirectory$MMapIndexInput@37f02eaa'=>'random_foo',class org.apache.lucene.search.FieldCache$StringIndex,null=>org.apache.lucene.search.FieldCache$StringIndex#2138852435 I'm using Solr 3.1 > RandomSortField values are cached in the FieldCache > --- > > Key: SOLR-2554 > URL: https://issues.apache.org/jira/browse/SOLR-2554 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.1 >Reporter: Vadim Geshel > > The values of RandomSortField get cached in the FieldCache. When using many > RandomSortFields over time, this leads to running out of memory. > This may be one of the cases already covered in SOLR- but I'm not sure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
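The memory problem exists because each distinct random field name materializes a per-document value array in the FieldCache. The alternative is to never materialize anything: a deterministic hash of a field-name seed and the doc id can be computed on the fly, giving a stable but shuffled order with no cached state. A rough sketch of that idea (the mixing constant and names are illustrative, not RandomSortField's actual code):

```java
public class RandomOrderSketch {
    // Derive a deterministic pseudo-random value from a field-name seed and a
    // document id on the fly, instead of caching an array of values per doc.
    static int randomValue(int seed, int docId) {
        int h = seed ^ docId;
        h = h * 0x27d4eb2d;  // multiplicative mixing (constant chosen here arbitrarily)
        h ^= h >>> 15;
        return h;
    }

    public static void main(String[] args) {
        int seed = "random_foo".hashCode();
        // Same (seed, doc) always yields the same value -> a stable sort order
        System.out.println(randomValue(seed, 42) == randomValue(seed, 42)); // true
        // A different seed (i.e. a different dynamic field name) reshuffles
        // the order, still without any per-field cached state.
        System.out.println(randomValue(seed, 42) == randomValue(seed + 1, 42));
    }
}
```

The bug report above is exactly the case where this on-the-fly path is bypassed: used as a function query ({!func}random_foo), the field goes through the FieldCache machinery and each new name leaks an entry.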
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8823 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/8823/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...]
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8819 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8819/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...]
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 8822 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/8822/ No tests ran. Build Log (for compile errors): [...truncated 62 lines...]
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8818 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8818/ 1 tests failed. REGRESSION: org.apache.solr.common.util.ContentStreamTest.testURLStream Error Message: Server returned HTTP response code: 403 for URL: http://svn.apache.org/repos/asf/lucene/dev/trunk/ Stack Trace: java.io.IOException: Server returned HTTP response code: 403 for URL: http://svn.apache.org/repos/asf/lucene/dev/trunk/ at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1269) at org.apache.solr.common.util.ContentStreamTest.testURLStream(ContentStreamTest.java:74) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1403) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1321) Build Log (for compile errors): [...truncated 8444 lines...]
[jira] [Commented] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048923#comment-13048923 ] Robert Muir commented on LUCENE-3200: - at a glance the patch is looking really good overall! I'll help with some review and testing. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. 
[jira] [Assigned] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-3200: - Assignee: Uwe Schindler > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Attachments: LUCENE-3200.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3200: -- Attachment: LUCENE-3200.patch Here the patch. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler > Attachments: LUCENE-3200.patch > > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-3.x - Build # 407 - Still Failing
Build: https://builds.apache.org/job/Lucene-3.x/407/ 2 tests failed. FAILED: org.apache.lucene.search.TestPhraseQuery.testRandomPhrases Error Message: this writer hit an OutOfMemoryError; cannot complete optimize Stack Trace: java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot complete optimize at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2527) at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2475) at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2445) at org.apache.lucene.index.RandomIndexWriter.doRandomOptimize(RandomIndexWriter.java:179) at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:195) at org.apache.lucene.index.RandomIndexWriter.getReader(RandomIndexWriter.java:189) at org.apache.lucene.search.TestPhraseQuery.testRandomPhrases(TestPhraseQuery.java:662) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1268) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1186) FAILED: org.apache.lucene.util.fst.TestFSTs.testBigSet Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at java.util.HashMap.resize(HashMap.java:479) at java.util.HashMap.addEntry(HashMap.java:772) at java.util.HashMap.put(HashMap.java:402) at org.apache.lucene.util.fst.TestFSTs$FSTTester.verifyPruned(TestFSTs.java:791) at org.apache.lucene.util.fst.TestFSTs$FSTTester.doTest(TestFSTs.java:499) at org.apache.lucene.util.fst.TestFSTs$FSTTester.doTest(TestFSTs.java:363) at org.apache.lucene.util.fst.TestFSTs.doTest(TestFSTs.java:211) at org.apache.lucene.util.fst.TestFSTs.testRandomWords(TestFSTs.java:944) at org.apache.lucene.util.fst.TestFSTs.testBigSet(TestFSTs.java:964) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1268) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1186) Build Log (for 
compile errors): [...truncated 12491 lines...]
[jira] [Updated] (LUCENE-2979) Simplify configuration API of contrib Query Parser
[ https://issues.apache.org/jira/browse/LUCENE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phillipe Ramalho updated LUCENE-2979: - Attachment: LUCENE-2979_phillipe_reamalho.patch This is finally my first patch. Sorry for taking so long, but I started changing the API and it broke a lot of code, which took forever to fix. Now it's working and all junits are passing. So far, I changed the entire configuration API. Next step is to write more junits and update/write javadocs. > Simplify configuration API of contrib Query Parser > -- > > Key: LUCENE-2979 > URL: https://issues.apache.org/jira/browse/LUCENE-2979 > Project: Lucene - Java > Issue Type: Improvement > Components: modules/other >Affects Versions: 2.9, 3.0 >Reporter: Adriano Crestani >Assignee: Adriano Crestani > Labels: api-change, gsoc, gsoc2011, lucene-gsoc-11, mentor > Fix For: 3.3 > > Attachments: LUCENE-2979_phillipe_reamalho.patch > > > The current configuration API is very complicated and inherit the concept > used by Attribute API to store token information in token streams. However, > the requirements for both (QP config and token stream) are not the same, so > they shouldn't be using the same thing. > I propose to simplify QP config and make it less scary for people intending > to use contrib QP. The task is not difficult, it will just require a lot of > code change and figure out the best way to do it. That's why it's a good > candidate for a GSoC project. > I would like to hear good proposals about how to make the API more friendly > and less scaring :) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
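One common way to make such a configuration API "less scary" than the Attribute-style approach is a map with typed keys, where the generic parameter on each key carries its value type. A hypothetical sketch of that direction (all names here are invented for illustration, not taken from the attached patch):

```java
import java.util.HashMap;
import java.util.Map;

public class QueryConfigSketch {
    // A typed key: the generic parameter ties the key to its value type,
    // so lookups are type-safe without Attribute-interface machinery.
    static final class ConfigKey<T> {
        final String name;
        ConfigKey(String name) { this.name = name; }
    }

    private final Map<ConfigKey<?>, Object> values = new HashMap<>();

    <T> void set(ConfigKey<T> key, T value) { values.put(key, value); }

    @SuppressWarnings("unchecked") // safe: set() enforced the key's type
    <T> T get(ConfigKey<T> key) { return (T) values.get(key); }

    public static void main(String[] args) {
        ConfigKey<Boolean> LOWERCASE = new ConfigKey<>("lowercaseExpandedTerms");
        ConfigKey<Integer> FUZZY_PREFIX = new ConfigKey<>("fuzzyPrefixLength");

        QueryConfigSketch config = new QueryConfigSketch();
        config.set(LOWERCASE, true);
        config.set(FUZZY_PREFIX, 2);
        System.out.println(config.get(LOWERCASE));    // true
        System.out.println(config.get(FUZZY_PREFIX)); // 2
    }
}
```

The appeal over the token-stream-style Attribute API is that a key is just a value, not an interface per setting, so adding a new config option requires one constant rather than new classes.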
[jira] [Commented] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048916#comment-13048916 ] Yonik Seeley commented on SOLR-2554: The reporter probably meant the filterCache (although the filterCache should be sized to avoid OOM errors). Anyway, I plan on starting work soon on a "cache=false" option for queries. > RandomSortField values are cached in the FieldCache > --- > > Key: SOLR-2554 > URL: https://issues.apache.org/jira/browse/SOLR-2554 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.1 >Reporter: Vadim Geshel > > The values of RandomSortField get cached in the FieldCache. When using many > RandomSortFields over time, this leads to running out of memory. > This may be one of the cases already covered in SOLR- but I'm not sure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: tentative! release notes drafts
On Mon, Jun 13, 2011 at 8:14 PM, Chris Hostetter wrote: > > : Since nobody objected to the idea, I created the following templates > : for 3.3, and added a few already-committed things. > > I like the idea ... but why not just keep it in SVN? > > (that way patches can suggest wording for the release notes if/when the patch contains a notable feature worthy of "Release Highlights") > no reason really, I didn't think of putting it in SVN and the wiki was easy at the time... we could just as easily put it in SVN instead...
Re: tentative! release notes drafts
: Since nobody objected to the idea, I created the following templates : for 3.3, and added a few already-committed things. I like the idea ... but why not just keep it in SVN? (that way patches can suggest wording for the release notes if/when the patch contains a notable feature worthy of "Release Highlights") -Hoss
[jira] [Commented] (LUCENE-3201) improved compound file handling
[ https://issues.apache.org/jira/browse/LUCENE-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048912#comment-13048912 ] Robert Muir commented on LUCENE-3201: - I think for this one, I prefer to wait for Uwe's refactoring of MMap on LUCENE-3200. Then mmap is simpler, and i think we can even use the same indexinput implementation here. This would mean no slowdown when searching CFS. > improved compound file handling > --- > > Key: LUCENE-3201 > URL: https://issues.apache.org/jira/browse/LUCENE-3201 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Robert Muir > > Currently CompoundFileReader could use some improvements, i see the following > problems > * its CSIndexInput extends bufferedindexinput, which is stupid for > directories like mmap. > * it seeks on every readInternal > * its not possible for a directory to override or improve the handling of > compound files. > for example: it seems if you were impl'ing this thing from scratch, you would > just wrap the II directly (not extend BufferedIndexInput, > and add compound file offset X to seek() calls, and override length(). But of > course, then you couldnt throw read past EOF always when you should, > as a user could read into the next file and be left unaware. > however, some directories could handle this better. for example MMapDirectory > could return an indexinput that simply mmaps the 'slice' of the CFS file. > its underlying bytebuffer etc naturally does bounds checks already etc, so it > wouldnt need to be buffered, not even needing to add any offsets to seek(), > as its position would just work. 
> So I think we should try to refactor this so that a Directory can customize > how compound files are handled, the simplest > case for the least code change would be to add this to Directory.java: > {code} > public Directory openCompoundInput(String filename) { > return new CompoundFileReader(this, filename); > } > {code} > Because most code depends upon the fact compound files are implemented as a > Directory and transparent. at least then a subclass could override... > but the 'recursion' is a little ugly... we could still label it > expert+internal+experimental or whatever. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3201) improved compound file handling
[ https://issues.apache.org/jira/browse/LUCENE-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048906#comment-13048906 ] Michael McCandless commented on LUCENE-3201: +1 > improved compound file handling > --- > > Key: LUCENE-3201 > URL: https://issues.apache.org/jira/browse/LUCENE-3201 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Robert Muir > > Currently CompoundFileReader could use some improvements, i see the following > problems > * its CSIndexInput extends bufferedindexinput, which is stupid for > directories like mmap. > * it seeks on every readInternal > * its not possible for a directory to override or improve the handling of > compound files. > for example: it seems if you were impl'ing this thing from scratch, you would > just wrap the II directly (not extend BufferedIndexInput, > and add compound file offset X to seek() calls, and override length(). But of > course, then you couldnt throw read past EOF always when you should, > as a user could read into the next file and be left unaware. > however, some directories could handle this better. for example MMapDirectory > could return an indexinput that simply mmaps the 'slice' of the CFS file. > its underlying bytebuffer etc naturally does bounds checks already etc, so it > wouldnt need to be buffered, not even needing to add any offsets to seek(), > as its position would just work. > So I think we should try to refactor this so that a Directory can customize > how compound files are handled, the simplest > case for the least code change would be to add this to Directory.java: > {code} > public Directory openCompoundInput(String filename) { > return new CompoundFileReader(this, filename); > } > {code} > Because most code depends upon the fact compound files are implemented as a > Directory and transparent. at least then a subclass could override... > but the 'recursion' is a little ugly... 
we could still label it > expert+internal+experimental or whatever. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3201) improved compound file handling
improved compound file handling --- Key: LUCENE-3201 URL: https://issues.apache.org/jira/browse/LUCENE-3201 Project: Lucene - Java Issue Type: Improvement Reporter: Robert Muir Currently CompoundFileReader could use some improvements, i see the following problems * its CSIndexInput extends bufferedindexinput, which is stupid for directories like mmap. * it seeks on every readInternal * its not possible for a directory to override or improve the handling of compound files. for example: it seems if you were impl'ing this thing from scratch, you would just wrap the II directly (not extend BufferedIndexInput, and add compound file offset X to seek() calls, and override length(). But of course, then you couldnt throw read past EOF always when you should, as a user could read into the next file and be left unaware. however, some directories could handle this better. for example MMapDirectory could return an indexinput that simply mmaps the 'slice' of the CFS file. its underlying bytebuffer etc naturally does bounds checks already etc, so it wouldnt need to be buffered, not even needing to add any offsets to seek(), as its position would just work. So I think we should try to refactor this so that a Directory can customize how compound files are handled, the simplest case for the least code change would be to add this to Directory.java: {code} public Directory openCompoundInput(String filename) { return new CompoundFileReader(this, filename); } {code} Because most code depends upon the fact compound files are implemented as a Directory and transparent. at least then a subclass could override... but the 'recursion' is a little ugly... we could still label it expert+internal+experimental or whatever. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
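The "slice" idea described above can be sketched outside Lucene's actual classes. This is a minimal, hypothetical illustration (the names `SliceInput`, `data`, `offset` are invented for this sketch, not Lucene's API): a view over one sub-file of a larger compound file that adds the sub-file's offset on every access, overrides `length()`, and bounds-checks so a read past the slice's end fails instead of silently returning the next file's bytes.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical sketch: a "slice" over a larger byte region, standing in
// for an IndexInput over one file inside a compound (CFS) file.
class SliceInput {
    private final byte[] data;   // stands in for the mmapped CFS file
    private final long offset;   // where this sub-file starts
    private final long length;   // length of the sub-file
    private long pos;            // position relative to the slice

    SliceInput(byte[] data, long offset, long length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    long length() { return length; }

    // Seek is relative to the slice; the file offset is added on read.
    void seek(long p) throws IOException {
        if (p < 0 || p > length) throw new IOException("seek past EOF");
        pos = p;
    }

    // Bounds check so a caller cannot read into the next sub-file.
    byte readByte() throws IOException {
        if (pos >= length) throw new EOFException("read past EOF");
        return data[(int) (offset + pos++)];
    }
}
```

An MMap-style directory could return such a slice directly over its mapped buffer, with no buffering layer at all, which is the point of letting the Directory customize compound file handling.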
[jira] [Resolved] (SOLR-2554) RandomSortField values are cached in the FieldCache
[ https://issues.apache.org/jira/browse/SOLR-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-2554. Resolution: Cannot Reproduce Hmm reviewing the code i don't see any way RandomSortField would use the FieldCache. (or ever could have in any previous release) I did some very basic testing with the example solr configs on trunk and i can not reproduce... starting solr up clean, loading the sample data and then executing these queries... * http://localhost:8983/solr/select/?q=*%3A*&sort=random_foo+asc * http://localhost:8983/solr/select/?q=*%3A*&sort=random_bar+asc * http://localhost:8983/solr/select/?q=*%3A*&sort=random_yak+asc ...i got three different orderings, but when i then checked http://localhost:8983/solr/admin/stats.jsp#cache i verified that fieldCache was empty. If you get different results, please re-open and be specific about the version of solr you are using, the steps to reproduce, and the info about fieldCache that you get back from stats.jsp > RandomSortField values are cached in the FieldCache > --- > > Key: SOLR-2554 > URL: https://issues.apache.org/jira/browse/SOLR-2554 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: 3.1 >Reporter: Vadim Geshel > > The values of RandomSortField get cached in the FieldCache. When using many > RandomSortFields over time, this leads to running out of memory. > This may be one of the cases already covered in SOLR- but I'm not sure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
positional index
Hi, I know there is a positional index in the Lucene implementation. If anyone is familiar with it, could you let me know how it is used in Lucene, and with which algorithms? Best, --- Minh
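For readers following along, what a positional index stores can be shown with a toy example (this is an illustration only, not Lucene code): for each term, the documents containing it plus the token positions within each document. Phrase matching then becomes a position-adjacency check.

```java
import java.util.*;

// Toy positional inverted index: term -> (docId -> token positions).
class PositionalIndex {
    final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Docs where `first` is immediately followed by `second`.
    List<Integer> phrase(String first, String second) {
        List<Integer> hits = new ArrayList<>();
        Map<Integer, List<Integer>> a =
            postings.getOrDefault(first, Collections.emptyMap());
        Map<Integer, List<Integer>> b =
            postings.getOrDefault(second, Collections.emptyMap());
        for (int doc : a.keySet()) {
            if (!b.containsKey(doc)) continue;
            for (int p : a.get(doc)) {
                if (b.get(doc).contains(p + 1)) { hits.add(doc); break; }
            }
        }
        return hits;
    }
}
```

Lucene's real postings are delta-encoded and streamed rather than held in maps, but the position-adjacency idea for phrase queries is the same.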
[jira] [Commented] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
[ https://issues.apache.org/jira/browse/LUCENE-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048858#comment-13048858 ] Robert Muir commented on LUCENE-3200: - also, we can fix the issue Shai brought up for the 3.1 VOTE while we are here. in seek(long pos) i think we should do: {code} try { ... position() ... } catch (IllegalArgumentException e) { if (pos < 0) throw exc; else throw new IOException("read past EOF"); } {code} This would be more consistent with NIOFS/SimpleFS from an exception perspective. > Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized > of powers of 2 > --- > > Key: LUCENE-3200 > URL: https://issues.apache.org/jira/browse/LUCENE-3200 > Project: Lucene - Java > Issue Type: Improvement >Reporter: Uwe Schindler > > Robert and me discussed a little bit after Mike's investigations, that using > SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot > slowdowns sometimes. > We had the following ideas: > - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the > switching between buffer boundaries is done in exception catch blocks. So > normal code path is always the same like for Single* > - Only the seek method uses strange calculations (the modulo is totally > bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very > strange way of calculating modulo in the original code) > - Because of speed we suggest to no longer use arbitrary buffer sizes. We > should pass only the power of 2 to the indexinput as size. All calculations > in seek and anywhere else would be simple bit shifts and AND operations (the > and masks for the modulo can be calculated in the ctor like NumericUtils does > when calculating precisionSteps). > - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an > issue at all. 
In my opinion, a buffer size of 2^31-1 is stupid in all cases, > as it will no longer fit page boundaries and mmapping gets harder for the O/S. > We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3200) Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2
Cleanup MMapDirectory to use only one MMapIndexInput impl with mapping sized of powers of 2 --- Key: LUCENE-3200 URL: https://issues.apache.org/jira/browse/LUCENE-3200 Project: Lucene - Java Issue Type: Improvement Reporter: Uwe Schindler Robert and me discussed a little bit after Mike's investigations, that using SingleMMapIndexinput together with MultiMMapIndexInput leads to hotspot slowdowns sometimes. We had the following ideas: - MultiMMapIndexInput is almost as fast as SingleMMapIndexInput, as the switching between buffer boundaries is done in exception catch blocks. So normal code path is always the same like for Single* - Only the seek method uses strange calculations (the modulo is totally bogus, it could be simply: int bufOffset = (int) (pos % maxBufSize); - very strange way of calculating modulo in the original code) - Because of speed we suggest to no longer use arbitrary buffer sizes. We should pass only the power of 2 to the indexinput as size. All calculations in seek and anywhere else would be simple bit shifts and AND operations (the and masks for the modulo can be calculated in the ctor like NumericUtils does when calculating precisionSteps). - the maximum buffer size will now be 2^30, not 2^31-1. But thats not an issue at all. In my opinion, a buffer size of 2^31-1 is stupid in all cases, as it will no longer fit page boundaries and mmapping gets harder for the O/S. We will provide a patch with those cleanups. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
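The power-of-2 arithmetic proposed above can be sketched concretely (field names here are assumptions for illustration, not the actual MultiMMapIndexInput members): with a chunk size of 2^k, the buffer index is a right shift by k and the offset within the buffer is an AND with a mask precomputed in the constructor, replacing division and modulo entirely.

```java
// Sketch of seek() arithmetic when the buffer size is forced to 2^k.
class ChunkMath {
    final int chunkSizePower;   // e.g. 30 for 1 GB chunks
    final long chunkSizeMask;   // precomputed in ctor, like NumericUtils does

    ChunkMath(int chunkSizePower) {
        this.chunkSizePower = chunkSizePower;
        this.chunkSizeMask = (1L << chunkSizePower) - 1;
    }

    // Which mapped buffer holds `pos`: pos / 2^k as a shift.
    int bufferIndex(long pos)  { return (int) (pos >>> chunkSizePower); }

    // Offset inside that buffer: pos % 2^k as a mask.
    int bufferOffset(long pos) { return (int) (pos & chunkSizeMask); }
}
```

This is why the maximum chunk becomes 2^30 rather than 2^31-1: only an exact power of two admits the shift/mask form (and, as the issue notes, 2^31-1 does not align to page boundaries anyway).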
[jira] [Created] (LUCENE-3199) Add non-destructive sort to BytesRefHash
Add non-destructive sort to BytesRefHash - Key: LUCENE-3199 URL: https://issues.apache.org/jira/browse/LUCENE-3199 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Jason Rutherglen Priority: Minor Currently the BytesRefHash is destructive. We can add a method that returns a non-destructively generated int[]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
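The shape of a non-destructive sort can be sketched without the real BytesRefHash (class and field names below are invented for the sketch): copy the live ids into a fresh int[], sort the copy by the values the ids point to, and leave the hash's internal id array untouched so the hash stays usable afterwards.

```java
import java.util.Arrays;
import java.util.Comparator;

// Toy stand-in for BytesRefHash: values addressed by id.
class TinyHash {
    final String[] byId;   // stands in for the stored BytesRef values
    final int[] ids;       // internal order; must not be disturbed

    TinyHash(String... values) {
        byId = values;
        ids = new int[values.length];
        for (int i = 0; i < ids.length; i++) ids[i] = i;
    }

    // Returns sorted ids without mutating internal state.
    int[] sortedIds() {
        Integer[] copy = new Integer[ids.length];
        for (int i = 0; i < ids.length; i++) copy[i] = ids[i];
        Arrays.sort(copy, Comparator.comparing(id -> byId[id]));
        int[] result = new int[copy.length];
        for (int i = 0; i < copy.length; i++) result[i] = copy[i];
        return result;
    }
}
```

The trade-off is the extra int[] allocation per call, which is presumably why the destructive in-place sort exists in the first place.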
Re: Lucene Facet path
Martijn, If the title is correct "Post grouping faceting" then maybe the bit set based system should be a separate issue? Eg, is there a bit set implementation today in LUCENE-3079? On Mon, Jun 13, 2011 at 2:58 PM, Martijn v Groningen wrote: > There is already an issue open for this: > LUCENE-3079 > > As the issues describes, the faceting in Solr relies on the schema (and off > course the UIF). > So having the noting of a FieldType in the facet module would be very > helpful for selecting the right facet implementation. > Currently in Solr there is only one facet method for field facet that work > per-segment, > but I think in the end we would want all facet types and methods to work on > a per-segment basis. > Martijn > On 13 June 2011 23:47, Jason Rutherglen wrote: >> >> I think it's a better approach than rewriting Solr's internals. Eg, >> small development steps could be taken, using the knowledge learned >> from Solr's facet system. Eg, caching and intersecting bit sets would >> be an easy-ish first step? >> >> On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer >> wrote: >> > I believe people are already looking into that but I am not sure. >> > sounds reasonable to me but I think its going to be lots of work >> > >> > simon >> > >> > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen >> > wrote: >> >> Are we going the direction of creating full facet features outside of >> >> Solr? Eg, we have UIF extrapolated out, we can probably make a module >> >> for bit set intersections as well. In the process the faceting will >> >> go per-segment. 
>> >> >> >> - >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> >> >> > >> > - >> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> > For additional commands, e-mail: dev-h...@lucene.apache.org >> > >> > >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048821#comment-13048821 ] Uwe Schindler commented on LUCENE-3198: --- Thats fine, I just wanted to talk about the whole issue what to enable when and bring together all possible platform possibilities. In general we should per default only enable SimpleFSDirectory on unknown platforms. Maybe NIO is heavily broken on OS XY (Android *lol*)? > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2523) SolrJ QueryResponse doesn't support range facets
[ https://issues.apache.org/jira/browse/SOLR-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048820#comment-13048820 ] Hoss Man commented on SOLR-2523: bq. I'm not a date-math expert, but is there a problem with using the gap w/o having to parse it (i.e. can we always append it?) that is exactly how it was designed to be used. But ultimately i *really* want to implement SOLR-1896 so no client (in any language) ever has to think about any of this. bq. Good idea! This would be really useful for any client. I think we can change this in SolrQueryParser#getRangeQuery() method or in DateField#parseMath(...). i'm a little lost ... I don't understand what "change" is being suggested in this sentence ... can't the client already access both the values and the gap and concat them? > SolrJ QueryResponse doesn't support range facets > > > Key: SOLR-2523 > URL: https://issues.apache.org/jira/browse/SOLR-2523 > Project: Solr > Issue Type: Improvement > Components: clients - java >Reporter: Martijn van Groningen >Assignee: Martijn van Groningen >Priority: Trivial > Fix For: 3.3, 4.0 > > Attachments: SOLR-2523.patch, SOLR-2523.patch > > > It is possible to get date facets and pivot facets in SolrJ. > {code:java} > queryResponse.getFacetDate(); > queryResponse.getFacetPivot(); > {code} > Having this also for range fields would be nice. Adding this is trivial. > Maybe we should deprecate date facet methods in QueryResponse class? Since it > is superseded by range facets. Also some set / add / remove methods for > setting facet range parameters on the SolrQuery class would be nice. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
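Hoss's point that the client can "always append it" comes down to plain string handling (sketch only, no Solr classes involved): a range facet bucket's end is its start value with the gap expression concatenated, which Solr's date math then evaluates server-side.

```java
// Sketch: computing a bucket's upper bound by appending the gap,
// e.g. a Solr date-math gap like "+1DAY".
class RangeBucket {
    static String bucketEnd(String start, String gap) {
        return start + gap;   // no parsing of the gap needed
    }
}
```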
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048817#comment-13048817 ] Ryan McKinley commented on SOLR-2588: - "bug" is a stretch... I think what this is getting at is that velocity is now required for solr to work at all. With some small changes, Velocity could be optional. I think something as easy as: {code} Index: solr/src/java/org/apache/solr/core/SolrCore.java === --- solr/src/java/org/apache/solr/core/SolrCore.java(revision 1134331) +++ solr/src/java/org/apache/solr/core/SolrCore.java(working copy) @@ -1381,7 +1381,12 @@ m.put("ruby", new RubyResponseWriter()); m.put("raw", new RawResponseWriter()); m.put("javabin", new BinaryResponseWriter()); -m.put("velocity", new VelocityResponseWriter()); +try { + m.put("velocity", new VelocityResponseWriter()); +} +catch( Throwable t ) { + log.warn("Error initializing VelocityResponseWriter", t ); +} m.put("csv", new CSVResponseWriter()); DEFAULT_RESPONSE_WRITERS = Collections.unmodifiableMap(m); } {code} Is all he is talking about... but I'm not sure how/if we want to deal with the error being gobbled... perhaps something smarter to see if Velocity can be created before trying? > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... 
ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
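One way to realize Ryan's "see if Velocity can be created before trying" suggestion is an explicit class-presence probe rather than catching an arbitrary Throwable. This is a sketch of that alternative (the helper name is invented; it is not the actual SolrCore code):

```java
// Sketch: probe for a dependency's class before registering the
// component that needs it, so a missing jar is an expected condition
// rather than a swallowed Throwable.
class OptionalWriterLoader {
    static boolean classPresent(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

SolrCore's static initializer could then check for "org.apache.velocity.context.Context" and only register the Velocity writer (with a warning otherwise) when the probe succeeds, keeping the failure mode visible in the logs.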
[jira] [Commented] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048818#comment-13048818 ] Robert Muir commented on LUCENE-3198: - why jump the gun, we can just enable it for linux/64-bit. if others like freebsd or macos X are tested, then we add those to the list, but mmap is a little bit scary to just apply as a blanket default? in all cases it should be like the current logic: if (XYZ_OS && 64_bit && *UNMAP_SUPPORTED*) > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
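Robert's gating condition can be written out as a sketch (method and parameter names are assumptions for illustration; the real Lucene code centralizes such checks in its Constants class): MMap becomes the default only when the OS is explicitly on the tested allow-list, the JVM is 64-bit, and unmapping is supported.

```java
// Sketch of the "if (XYZ_OS && 64_bit && UNMAP_SUPPORTED)" gate.
class DirectoryDefaults {
    static boolean useMMap(String osName, String dataModel, boolean unmapSupported) {
        boolean knownGoodOs = osName.startsWith("Linux"); // extend as platforms are tested
        boolean is64Bit = "64".equals(dataModel);          // e.g. sun.arch.data.model
        return knownGoodOs && is64Bit && unmapSupported;
    }
}
```

Unknown platforms fall through to the conservative default, which matches Uwe's point about only enabling SimpleFSDirectory where nothing is known.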
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048815#comment-13048815 ] Uwe Schindler commented on SOLR-2588: - I generally also do not use the webapp directly, so thats not uncommon! > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene Facet path
There is already an issue open for this: LUCENE-3079 As the issues describes, the faceting in Solr relies on the schema (and off course the UIF). So having the noting of a FieldType in the facet module would be very helpful for selecting the right facet implementation. Currently in Solr there is only one facet method for field facet that work per-segment, but I think in the end we would want all facet types and methods to work on a per-segment basis. Martijn On 13 June 2011 23:47, Jason Rutherglen wrote: > I think it's a better approach than rewriting Solr's internals. Eg, > small development steps could be taken, using the knowledge learned > from Solr's facet system. Eg, caching and intersecting bit sets would > be an easy-ish first step? > > On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer > wrote: > > I believe people are already looking into that but I am not sure. > > sounds reasonable to me but I think its going to be lots of work > > > > simon > > > > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen > > wrote: > >> Are we going the direction of creating full facet features outside of > >> Solr? Eg, we have UIF extrapolated out, we can probably make a module > >> for bit set intersections as well. In the process the faceting will > >> go per-segment. > >> > >> - > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> > >> > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > > For additional commands, e-mail: dev-h...@lucene.apache.org > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048814#comment-13048814 ] Mark Miller commented on SOLR-2588: --- Perhaps he is not using the webapp? > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048812#comment-13048812 ] Uwe Schindler commented on LUCENE-3198: --- That means we can now enable MMap for all 64 bit platforms? Solaris, windows, Linux - any others except FreeBSD? FreeBSD needs to be checked, but I assume its also faster there. We can check on lucene.zones maybe. > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene Facet path
I think it's a better approach than rewriting Solr's internals. Eg, small development steps could be taken, using the knowledge learned from Solr's facet system. Eg, caching and intersecting bit sets would be an easy-ish first step? On Mon, Jun 13, 2011 at 2:37 PM, Simon Willnauer wrote: > I believe people are already looking into that but I am not sure. > sounds reasonable to me but I think its going to be lots of work > > simon > > On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen > wrote: >> Are we going the direction of creating full facet features outside of >> Solr? Eg, we have UIF extrapolated out, we can probably make a module >> for bit set intersections as well. In the process the faceting will >> go per-segment. >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2588) Solr doesn't work without Velocity on classpath
[ https://issues.apache.org/jira/browse/SOLR-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048799#comment-13048799 ] Hoss Man commented on SOLR-2588: I don't understand this bug? in SOLR-1957 the velocity response writer was promoted from being a contrib to being part of the solr core so that the jars are all included in the solr.war and the velocity writer would be one of the writers provided by deault. nothing special should be needed on the classpath. > Solr doesn't work without Velocity on classpath > --- > > Key: SOLR-2588 > URL: https://issues.apache.org/jira/browse/SOLR-2588 > Project: Solr > Issue Type: Bug >Affects Versions: 3.2 >Reporter: Gunnar Wagenknecht > Fix For: 3.3 > > > In 1.4. it was fine to run Solr without Velocity on the classpath. However, > in 3.2. SolrCore won't load because of a hard reference to the Velocity > response writer in a static initializer. > {noformat} > ... ERROR org.apache.solr.core.CoreContainer - > java.lang.NoClassDefFoundError: org/apache/velocity/context/Context > at org.apache.solr.core.SolrCore.(SolrCore.java:1447) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316) > at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207) > {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Lucene Facet path
I believe people are already looking into that but I am not sure. sounds reasonable to me but I think its going to be lots of work simon On Mon, Jun 13, 2011 at 11:34 PM, Jason Rutherglen wrote: > Are we going the direction of creating full facet features outside of > Solr? Eg, we have UIF extrapolated out, we can probably make a module > for bit set intersections as well. In the process the faceting will > go per-segment. > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3196. - Resolution: Fixed Committed in revision 1135293. > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance wich is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Lucene Facet path
Are we going the direction of creating full facet features outside of Solr? Eg, we have UIF extrapolated out, we can probably make a module for bit set intersections as well. In the process the faceting will go per-segment. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048787#comment-13048787 ] Uwe Schindler commented on LUCENE-3196: --- Looks fine, using this approach, separate norms impl can hopefully go away quite fast *g* For the PreFlex codec I even have an idea for the codec and backwards compatibility: The old norms file could be exposed as standard DocValues field by PreFlex codec. The r/w StandardCodec would never write separate norms files, instead simply write docvalues using this 1 byte approach (of course configureable to have e.g. read float norms, and other additional BM25 statistics or whatever). Just ideas, Uwe > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance wich is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
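The optimization in the issue description reduces to a simple idea, sketched here with invented names (not the actual FixedStraightBytes code): when every value is exactly one byte, as with norms, a paged or indirect representation buys nothing, and a flat byte[] indexed by docID gives direct access.

```java
// Sketch: single-byte-per-document values as a straight array.
class SingleByteValues {
    private final byte[] values;  // one byte per document

    SingleByteValues(byte[] values) { this.values = values; }

    byte get(int docId) { return values[docId]; }  // no paging indirection
}
```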
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048757#comment-13048757 ] Simon Willnauer commented on LUCENE-3196: - I am planning to commit this soon if nobody objects. > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance wich is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: commit-check target for ant?
: Ok, I get your point and I'm not going to force it, but I don't agree : people still running 1.5 should be able to compile from sources. I : mean: 1.5 has been dead for a longer while now; the same argument : could be made for java 1.4 or whatever most recent version has been that's an argument for changing our compatibility requirement to 1.6 -- i don't object to having that argument (not sure how i feel about the actual idea given the licensing hubub and lucene's nature as a *library* that lots of people embed in lots of apps - but i digress) but i don't see that as legitimate agrument in favor of having higher overhead for compiling then for running. for this discussion, it shouldn't matter what java version we are talking about, we could have the same argument about requiring 1.7 to compile but supporting binary releases that run on 1.6; or an argument about wether we should use a commercial tool that commiters have a license for to "build" java code from a source grammer -- the point is that as an open source project i think it's really important that *all* our users be allowed to compile from "source". -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2586) example work & logs directories needed?
[ https://issues.apache.org/jira/browse/SOLR-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048755#comment-13048755 ] Uwe Schindler commented on SOLR-2586: - bq. Bottom line I think, is if someone wants to ensure that Solr works well on Tomcat for example, then they should make a patch so that our tests test with this container too (e.g. in hudson, etc). Once its baked in hudson for a while, then I would say its easy for us to recommend it, too. That is the hardest task. Jetty is so cool, because it can be used and configured "embedded". To start up Tomcat, you have to provide final configuration files in the default folder layout and start a main() static method from a class. Something so easy like jettyServer.addServletFilter() & similar things are not possible with Tomcat out of the box. This makes Jetty (in my opinion) the best servlet container around. I sometimes also use it that way (embedded in my Java app). > example work & logs directories needed? > --- > > Key: SOLR-2586 > URL: https://issues.apache.org/jira/browse/SOLR-2586 > Project: Solr > Issue Type: Improvement > Components: Build >Reporter: David Smiley >Priority: Minor > > Firstly, what prompted this issue was me wanting to use a git solr mirror but > finding that git's lack of empty-directory support made the "example" ant > task fail. This task requires examples/work to be in place so that it can > delete its contents. Fixing this was a simple matter of adding: > {code:xml} > > {code} > Right before the delete task. > But then it occurred to me, why even have a "work" directory since Jetty will > apparently use a temp directory instead. -- try for yourself (stdout snippet): > bq. 
2011-06-11 00:51:26.177:INFO::Extract > file:/SmileyDev/Search/lucene-solr/solr/example/webapps/solr.war to > /var/folders/zo/zoQJvqc9E0076p0THiri+k+++TI/-Tmp-/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp > On my Mac, this same directory was used for multiple runs, so somehow Jetty > or the VM figures out how to reuse it. > Since this "example" setup isn't a *real* installation -- it's just for > demonstration, arguably it should not contain what it doesn't need. > Likewise, perhaps the empty example/logs directory should be deleted. It's > not used by default any way. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Indexing slower in trunk
On Mon, Jun 13, 2011 at 8:13 PM, Erick Erickson wrote: > I half remember that this has come up before, but I couldn't find the > thread. I was running some tests over the weekend that involved > indexing 1.9M documents from the English Wiki dump. > > I'm consistently seeing that trunk takes about twice as long to index > the docs as 1.4, 3.2 and 3x. Optimize is also taking quite a bit > longer I admit that these aren't very sophisticated tests, and I only > ran the trunk process twice (although both those were consistent). > > I'm pretty sure my rambuffersize and autocommit settings are > identical. I remove the data/index directory before each run. These > results are running the indexing program in IntelliJ, on my Mac, both > the server and the indexing programs were running locally. > > No, trunk isn't compiling before running . > > Here's the server definition: > new StreamingUpdateSolrServer(url, 10, 4); > > and I'm batching up the documents and sending them to Solr in batches of > 1,000. > > So, my question is whether this should be pursued. Note that I'm still > getting around 3K docs/second, which I can't complain about. Not that > that stops me, you understand. And in return for a memory footprint > reduction from 389M to 90M after some off-the-wall sorting and > faceting I'll take it! > > H, speaking of which, the memory usage changes seem like a good > candidate for a page on the Wiki, anyone want to suggest a home? > > > Solr 1.4.1 > Total Time Taken-> 257 seconds > Total documents added-> 1917728 > Docs/sec-> 7461 > starting optimize > optimizing took 26 seconds > > Solr 3.2 > Total Time Taken-> 243 seconds > Total documents added-> 1917728 > Docs/sec-> 7891 > starting optimize > optimizing took 21 seconds > > Solr 3x > Total Time Taken-> 269 seconds > Total documents added-> 1917728 > Docs/sec-> 7129 > starting optimize > optimizing took 21 seconds > > Solr trunk. 
2011-6-11: 17:24 EST > Total Time Taken-> 592 seconds > Total documents added-> 1917728 > Docs/sec-> 3239 > starting optimize > optimizing took 159 seconds > > What do folks think? Is there anything I can/should do to narrow this down? Hi Eric, this looks weird, I have some questions: - you are indexing into the same disk as you read the data from? - what are you rambuffer settings? - how many threads are you using to send data to solr? - what is your autocommit setting? simon > > Erick > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
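The batching pattern Erick describes ("batching up the documents and sending them to Solr in batches of 1,000") can be sketched in isolation. The class below is illustrative, not SolrJ API: a real client would call StreamingUpdateSolrServer.add(batch) at the point marked in flush(); here we just count flushes so the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Accumulates documents and flushes them in fixed-size batches.
public class DocBatcher {
    private final int batchSize;
    private final List<String> pending = new ArrayList<String>();
    private int flushes = 0;

    public DocBatcher(int batchSize) { this.batchSize = batchSize; }

    public void add(String doc) {
        pending.add(doc);
        if (pending.size() >= batchSize) flush(); // full batch: ship it
    }

    public void flush() {
        if (pending.isEmpty()) return;
        // server.add(pending) would go here in a real SolrJ client
        pending.clear();
        flushes++;
    }

    public int getFlushes() { return flushes; }

    public static void main(String[] args) {
        DocBatcher b = new DocBatcher(1000);
        for (int i = 0; i < 2500; i++) b.add("doc" + i);
        b.flush(); // send the trailing partial batch
        System.out.println("flushes: " + b.getFlushes()); // 3
    }
}
```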
[jira] [Commented] (SOLR-2586) example work & logs directories needed?
[ https://issues.apache.org/jira/browse/SOLR-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048746#comment-13048746 ] Yonik Seeley commented on SOLR-2586: bq. why even have a "work" directory since Jetty will apparently use a temp directory instead. Some things have reasons that we barely remember ;-) In this case, I think the main motivating factor might have been SOLR-118 I remember a number of people reporting failing JSPs over time, and it took quite a while to track it down. > example work & logs directories needed? > --- > > Key: SOLR-2586 > URL: https://issues.apache.org/jira/browse/SOLR-2586 > Project: Solr > Issue Type: Improvement > Components: Build >Reporter: David Smiley >Priority: Minor > > Firstly, what prompted this issue was me wanting to use a git solr mirror but > finding that git's lack of empty-directory support made the "example" ant > task fail. This task requires examples/work to be in place so that it can > delete its contents. Fixing this was a simple matter of adding: > {code:xml} > > {code} > Right before the delete task. > But then it occurred to me, why even have a "work" directory since Jetty will > apparently use a temp directory instead. -- try for yourself (stdout snippet): > bq. 2011-06-11 00:51:26.177:INFO::Extract > file:/SmileyDev/Search/lucene-solr/solr/example/webapps/solr.war to > /var/folders/zo/zoQJvqc9E0076p0THiri+k+++TI/-Tmp-/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp > On my Mac, this same directory was used for multiple runs, so somehow Jetty > or the VM figures out how to reuse it. > Since this "example" setup isn't a *real* installation -- it's just for > demonstration, arguably it should not contain what it doesn't need. > Likewise, perhaps the empty example/logs directory should be deleted. It's > not used by default any way. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3097) Post grouping faceting
[ https://issues.apache.org/jira/browse/LUCENE-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated LUCENE-3097: -- Attachment: LUCENE-3097.patch An updated version of the patch. This is still work in progress. I basically rewrote the code in the same way as the other collectors were rewritten for LUCENE-3099. Things to do are creating tests and adding some more documentation. This patch only covers the second facet / grouping method. > Post grouping faceting > -- > > Key: LUCENE-3097 > URL: https://issues.apache.org/jira/browse/LUCENE-3097 > Project: Lucene - Java > Issue Type: New Feature > Components: modules/grouping >Reporter: Martijn van Groningen >Assignee: Martijn van Groningen >Priority: Minor > Fix For: 3.3 > > Attachments: LUCENE-3097.patch, LUCENE-3097.patch > > > This issue focuses on implementing post grouping faceting. > * How to handle multivalued fields. What field value to show with the facet. > * Where the facet counts should be based on > ** Facet counts can be based on the normal documents. Ungrouped counts. > ** Facet counts can be based on the groups. Grouped counts. > ** Facet counts can be based on the combination of group value and facet > value. Matrix counts. > And probably more implementation options. > The first two methods are implemented in the SOLR-236 patch. For the first > option it calculates a DocSet based on the individual documents from the > query result. For the second option it calculates a DocSet for all the most > relevant documents of a group. Once the DocSet is computed the FacetComponent > and StatsComponent use one of these DocSets to create facets and statistics. > This last one is a bit more complex. I think it is best explained with an > example. Let's say we search on travel offers: > ||hotel||departure_airport||duration|| > |Hotel a|AMS|5| > |Hotel a|DUS|10| > |Hotel b|AMS|5| > |Hotel b|AMS|10| > If we group by hotel and have a facet for airport.
Most end users expect > (according to my experience of course) the following airport facet: > AMS: 2 > DUS: 1 > The above result can't be achieved by the first two methods. You either get > counts AMS:3 and DUS:1 or 1 for both airports. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
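The "matrix counts" behaviour from the hotel example can be sketched independently of Lucene: for each facet value, count the number of distinct group values it co-occurs with, rather than raw documents. With the offers in the issue description this yields AMS:2, DUS:1 (both hotels have an AMS offer; only hotel a has a DUS offer). Class and method names below are illustrative, not the patch's API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class GroupedFacetCounts {
    // rows: each entry is {groupValue, facetValue}; the count for a facet
    // value is the number of distinct group values seen with it.
    public static Map<String, Integer> count(List<String[]> rows) {
        Map<String, Set<String>> seen = new LinkedHashMap<String, Set<String>>();
        for (String[] row : rows) {
            String group = row[0], facet = row[1];
            Set<String> groups = seen.get(facet);
            if (groups == null) seen.put(facet, groups = new HashSet<String>());
            groups.add(group); // a Set, so duplicate (group, facet) pairs count once
        }
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet())
            counts.put(e.getKey(), e.getValue().size());
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> offers = Arrays.asList(
            new String[]{"Hotel a", "AMS"}, new String[]{"Hotel a", "DUS"},
            new String[]{"Hotel b", "AMS"}, new String[]{"Hotel b", "AMS"});
        System.out.println(count(offers)); // {AMS=2, DUS=1}
    }
}
```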
Re: Archive questions
http://mail-archives.apache.org/mod_mbox/lucene-openrelevance-dev/ On 13/06/2011 18:04, Patrick Durusau wrote: Itamar, Sorry to reply to my own answer but yes, the 1969-12-31 date is obviously a software glitch. But, the wiki says the project started in June 2009. So, my question is where are the email archives between June 2009 and September 2010? There may not be any but would be an answer. Hope you are at the start of a great week! Patrick On 06/12/2011 03:59 PM, Patrick Durusau wrote: Itamar, Thanks! Hope you are having a great weekend! Patrick On 6/12/2011 3:47 PM, Itamar Syn-Hershko wrote: Hi, On 12/06/2011 22:42, Patrick Durusau wrote: Questions: 1) The message: http://www.lucidimagination.com/search/document/4e91498fae518260/orp_newbie displays "Date: 1969-12-31" but the text of the message says: " On Tue, Sep 21, 2010 at 7:37 AM, Tommaso Teofili" I was under the impression this project started in 2009? Shouldn't the email archives start in 2009? Thats probably a software glitch. The project is fairly new. 2) The message mentioned above makes reference to the OpenRelevance Viewer: http://www.lucidimagination.com/search/out?u=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FORP%2FOpen%2BRelevance%2BViewer%29 That returns a page not found message. https://cwiki.apache.org/confluence/display/ORP/Open+Relevance+Viewer
RE: Welcome Jan Høydahl as Lucene/Solr committer
Congratulations, Jan! Karl -Original Message- From: ext Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, June 13, 2011 10:43 AM To: dev@lucene.apache.org Subject: Welcome Jan Høydahl as Lucene/Solr committer I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as our newest committer. Jan, if you don't mind, could you introduce yourself with a brief bio as has become our tradition? Congratulations and welcome aboard! - Mark Miller lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan ! Shai On Mon, Jun 13, 2011 at 10:25 PM, Dawid Weiss wrote: > Welcome Jan! > > On Mon, Jun 13, 2011 at 6:44 PM, Shalin Shekhar Mangar > wrote: > > Welcome Jan! > > > > On Mon, Jun 13, 2011 at 8:13 PM, Mark Miller > wrote: > >> > >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl > as > >> our newest committer. > >> > >> Jan, if you don't mind, could you introduce yourself with a brief bio as > >> has become our tradition? > >> > >> Congratulations and welcome aboard! > >> > >> > >> - Mark Miller > >> lucidimagination.com > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> - > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> > > > > > > > > -- > > Regards, > > Shalin Shekhar Mangar. > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
[jira] [Updated] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
[ https://issues.apache.org/jira/browse/LUCENE-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3198: --- Component/s: core/store Fix Version/s: 4.0 3.3 > Change default Directory impl on 64bit linux to MMap > > > Key: LUCENE-3198 > URL: https://issues.apache.org/jira/browse/LUCENE-3198 > Project: Lucene - Java > Issue Type: Improvement > Components: core/store >Reporter: Michael McCandless > Fix For: 3.3, 4.0 > > > Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle > 1.6.0_21) I see MMapDir getting better search and merge performance when > compared to NIOFSDir. > I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3198) Change default Directory impl on 64bit linux to MMap
Change default Directory impl on 64bit linux to MMap Key: LUCENE-3198 URL: https://issues.apache.org/jira/browse/LUCENE-3198 Project: Lucene - Java Issue Type: Improvement Reporter: Michael McCandless Consistently in my NRT testing on Fedora 13 Linux, 64 bit JVM (Oracle 1.6.0_21) I see MMapDir getting better search and merge performance when compared to NIOFSDir. I think we should fix the default. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
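A rough sketch of the selection logic this issue proposes — prefer memory-mapped I/O where 64-bit address space makes it safe. The real change belongs in Lucene's FSDirectory.open; the property values and the Windows special-casing below are illustrative assumptions, not the committed behaviour.

```java
import java.util.Locale;

public class DirectoryChooser {
    // Picks a Directory implementation name from platform hints:
    // plenty of address space (64-bit, non-Windows) -> MMapDirectory.
    public static String choose(String osName, String archDataModel) {
        boolean is64Bit = "64".equals(archDataModel);
        boolean isWindows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        if (is64Bit && !isWindows) return "MMapDirectory";
        return isWindows ? "SimpleFSDirectory" : "NIOFSDirectory";
    }

    public static void main(String[] args) {
        // "sun.arch.data.model" is a Sun/Oracle JVM property; other JVMs
        // may not define it, in which case this falls through to non-mmap.
        System.out.println(choose(System.getProperty("os.name"),
                                  System.getProperty("sun.arch.data.model")));
    }
}
```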
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan! On Mon, Jun 13, 2011 at 6:44 PM, Shalin Shekhar Mangar wrote: > Welcome Jan! > > On Mon, Jun 13, 2011 at 8:13 PM, Mark Miller wrote: >> >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as >> our newest committer. >> >> Jan, if you don't mind, could you introduce yourself with a brief bio as >> has become our tradition? >> >> Congratulations and welcome aboard! >> >> >> - Mark Miller >> lucidimagination.com >> >> >> >> >> >> >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> > > > > -- > Regards, > Shalin Shekhar Mangar. > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-2341) explore morfologik integration
[ https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned LUCENE-2341: --- Assignee: Dawid Weiss > explore morfologik integration > -- > > Key: LUCENE-2341 > URL: https://issues.apache.org/jira/browse/LUCENE-2341 > Project: Lucene - Java > Issue Type: New Feature > Components: modules/analysis >Reporter: Robert Muir >Assignee: Dawid Weiss > > Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer > available: > http://sourceforge.net/projects/morfologik/ > This works differently than LUCENE-2298, and ideally would be another option > for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time
[ https://issues.apache.org/jira/browse/LUCENE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3197: --- Component/s: core/index > Optimize runs forever if you keep deleting docs at the same time > > > Key: LUCENE-3197 > URL: https://issues.apache.org/jira/browse/LUCENE-3197 > Project: Lucene - Java > Issue Type: Bug > Components: core/index >Reporter: Michael McCandless >Priority: Minor > Fix For: 3.3, 4.0 > > > Because we "cascade" merges for an optimize... if you also delete documents > while the merges are running, then the merge policy will see the resulting > single segment as still not optimized (since it has pending deletes) and do a > single-segment merge, and will repeat indefinitely (as long as your app keeps > deleting docs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3197) Optimize runs forever if you keep deleting docs at the same time
Optimize runs forever if you keep deleting docs at the same time Key: LUCENE-3197 URL: https://issues.apache.org/jira/browse/LUCENE-3197 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Priority: Minor Fix For: 3.3, 4.0 Because we "cascade" merges for an optimize... if you also delete documents while the merges are running, then the merge policy will see the resulting single segment as still not optimized (since it has pending deletes) and do a single-segment merge, and will repeat indefinitely (as long as your app keeps deleting docs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
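The non-termination described above can be modelled with a toy loop (purely illustrative, no Lucene classes): as long as fresh deletes arrive between merges, the merge policy always sees pending deletes on the single remaining segment and schedules another single-segment merge. The cap below exists only so the sketch terminates.

```java
public class OptimizeLoopModel {
    /** Returns the number of single-segment merges performed before the cap. */
    public static int mergesUntil(int cap, boolean deletesKeepArriving) {
        boolean pendingDeletes = true; // deletes present when optimize starts
        int merges = 0;
        while (pendingDeletes && merges < cap) {
            merges++;                             // single-segment merge runs
            pendingDeletes = deletesKeepArriving; // did the app delete again mid-merge?
        }
        return merges;
    }

    public static void main(String[] args) {
        System.out.println(mergesUntil(100, true));  // 100 - never converges
        System.out.println(mergesUntil(100, false)); // 1 - done after one merge
    }
}
```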
[Lucene.Net] [jira] [Created] (LUCENENET-425) MMapDirectory implementation
MMapDirectory implementation Key: LUCENENET-425 URL: https://issues.apache.org/jira/browse/LUCENENET-425 Project: Lucene.Net Issue Type: New Feature Affects Versions: Lucene.Net 2.9.4g Reporter: Digy Priority: Trivial Fix For: Lucene.Net 2.9.4g Attachments: MMapDirectory.patch Since this is not a direct port of MMapDirectory.java, I'll put it under "Support" and implement MMapDirectory as {code} public class MMapDirectory : Lucene.Net.Support.MemoryMappedDirectory { } {code} If a Mem-Map cannot be created (for example, if the file is too big to fit in the 32 bit address range), it will default to FSDirectory.FSIndexInput. In my tests, I didn't see any performance gain in a 32bit environment and I consider it as better than nothing. I would be happy if someone could send test results on a 64bit platform. DIGY -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Updated] (LUCENENET-425) MMapDirectory implementation
[ https://issues.apache.org/jira/browse/LUCENENET-425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy updated LUCENENET-425: --- Attachment: MMapDirectory.patch > MMapDirectory implementation > > > Key: LUCENENET-425 > URL: https://issues.apache.org/jira/browse/LUCENENET-425 > Project: Lucene.Net > Issue Type: New Feature >Affects Versions: Lucene.Net 2.9.4g >Reporter: Digy >Priority: Trivial > Fix For: Lucene.Net 2.9.4g > > Attachments: MMapDirectory.patch > > > Since this is not a direct port of MMapDirectory.java, I'll put it under > "Support" and implement MMapDirectory as > {code} > public class MMapDirectory : Lucene.Net.Support.MemoryMappedDirectory > { > } > {code} > If a Mem-Map cannot be created (for example, if the file is too big to fit in the 32 > bit address range), it will default to FSDirectory.FSIndexInput > In my tests, I didn't see any performance gain in a 32bit environment and I > consider it as better than nothing. > I would be happy if someone could send test results on a 64bit platform. > DIGY -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (LUCENE-3193) TwoPhaseCommit interface
[ https://issues.apache.org/jira/browse/LUCENE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-3193. Resolution: Fixed Committed revision 1135204 (trunk). Committed revision 1135215 (3x). > TwoPhaseCommit interface > > > Key: LUCENE-3193 > URL: https://issues.apache.org/jira/browse/LUCENE-3193 > Project: Lucene - Java > Issue Type: New Feature > Components: core/index >Reporter: Shai Erera >Assignee: Shai Erera > Fix For: 3.3, 4.0 > > Attachments: LUCENE-3193.patch, LUCENE-3193.patch > > > I would like to propose a TwoPhaseCommit interface which declares the methods > necessary to implement a 2-phase commit algorithm: > * prepareCommit() > * commit() > * rollback() > The prepare/commit ones have variants that take a (Map > commitData) following the ones we have in IndexWriter. > In addition, a TwoPhaseCommitTool which implements a 2-phase commit amongst > several TPCs. > Having IndexWriter implement that interface will allow running the 2-phase > commit algorithm on multiple IWs or IW + any other object that implements the > interface. > We should mark the interface @lucene.internal so as to not block ourselves in > the future. This is pretty advanced stuff anyway. > Will post a patch soon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
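The interface and coordinator described in the issue can be sketched as plain Java. The method names follow the issue text (prepareCommit/commit/rollback); the Participant class and the execute() helper are illustrative stand-ins in the spirit of TwoPhaseCommitTool, not the committed API: prepare all participants first, and only run the commits if every prepare succeeds.

```java
public class TwoPhaseDemo {
    interface TwoPhaseCommit {
        void prepareCommit() throws Exception;
        void commit() throws Exception;
        void rollback() throws Exception;
    }

    // Toy participant that records the calls it receives.
    static class Participant implements TwoPhaseCommit {
        final StringBuilder log = new StringBuilder();
        public void prepareCommit() { log.append("P"); }
        public void commit() { log.append("C"); }
        public void rollback() { log.append("R"); }
    }

    /** Runs 2PC over all participants; rolls everyone back on any failure. */
    public static boolean execute(TwoPhaseCommit... objects) {
        try {
            for (TwoPhaseCommit o : objects) o.prepareCommit(); // phase 1
            for (TwoPhaseCommit o : objects) o.commit();        // phase 2
            return true;
        } catch (Exception e) {
            for (TwoPhaseCommit o : objects) {
                try { o.rollback(); } catch (Exception ignored) {}
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Participant a = new Participant(), b = new Participant();
        System.out.println(execute(a, b) + " " + a.log + " " + b.log); // true PC PC
    }
}
```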
[jira] [Resolved] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
[ https://issues.apache.org/jira/browse/SOLR-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe resolved SOLR-2590. --- Resolution: Fixed Committed: - r1135206: trunk - r1135207: branch_3x > javadoc.link.lucene property value in solr/common-build.xml is obsolete > --- > > Key: SOLR-2590 > URL: https://issues.apache.org/jira/browse/SOLR-2590 > Project: Solr > Issue Type: Bug > Components: Build >Affects Versions: 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.3, 4.0 > > Attachments: SOLR-2590.patch > > > The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target > no longer works. > From > https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : > {noformat} > [javadoc] javadoc: warning - Error fetching URL: > https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list > ... > BUILD FAILED > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: > Javadocs warnings were found! > {noformat} > The link should instead be > https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Indexing slower in trunk
I half remember that this has come up before, but I couldn't find the thread. I was running some tests over the weekend that involved indexing 1.9M documents from the English Wiki dump. I'm consistently seeing that trunk takes about twice as long to index the docs as 1.4, 3.2 and 3x. Optimize is also taking quite a bit longer I admit that these aren't very sophisticated tests, and I only ran the trunk process twice (although both those were consistent). I'm pretty sure my rambuffersize and autocommit settings are identical. I remove the data/index directory before each run. These results are running the indexing program in IntelliJ, on my Mac, both the server and the indexing programs were running locally. No, trunk isn't compiling before running . Here's the server definition: new StreamingUpdateSolrServer(url, 10, 4); and I'm batching up the documents and sending them to Solr in batches of 1,000. So, my question is whether this should be pursued. Note that I'm still getting around 3K docs/second, which I can't complain about. Not that that stops me, you understand. And in return for a memory footprint reduction from 389M to 90M after some off-the-wall sorting and faceting I'll take it! H, speaking of which, the memory usage changes seem like a good candidate for a page on the Wiki, anyone want to suggest a home? Solr 1.4.1 Total Time Taken-> 257 seconds Total documents added-> 1917728 Docs/sec-> 7461 starting optimize optimizing took 26 seconds Solr 3.2 Total Time Taken-> 243 seconds Total documents added-> 1917728 Docs/sec-> 7891 starting optimize optimizing took 21 seconds Solr 3x Total Time Taken-> 269 seconds Total documents added-> 1917728 Docs/sec-> 7129 starting optimize optimizing took 21 seconds Solr trunk. 2011-6-11: 17:24 EST Total Time Taken-> 592 seconds Total documents added-> 1917728 Docs/sec-> 3239 starting optimize optimizing took 159 seconds What do folks think? Is there anything I can/should do to narrow this down? 
Erick - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
[ https://issues.apache.org/jira/browse/SOLR-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2590: -- Attachment: SOLR-2590.patch Patch for trunk with the fixed Jenkins-built Lucene trunk javadocs URL. Locally, "ant javadoc" under solr/ succeeds. Committing shortly. > javadoc.link.lucene property value in solr/common-build.xml is obsolete > --- > > Key: SOLR-2590 > URL: https://issues.apache.org/jira/browse/SOLR-2590 > Project: Solr > Issue Type: Bug > Components: Build >Affects Versions: 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.3, 4.0 > > Attachments: SOLR-2590.patch > > > The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target > no longer works. > From > https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : > {noformat} > [javadoc] javadoc: warning - Error fetching URL: > https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list > ... > BUILD FAILED > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: > Javadocs warnings were found! > {noformat} > The link should instead be > https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
[ https://issues.apache.org/jira/browse/SOLR-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048676#comment-13048676 ] Steven Rowe commented on SOLR-2590: --- If you paste the old link into a browser, you get redirected to https://builds.apache.org/ - I guess the admins got tired of supporting the old Hudson links? > javadoc.link.lucene property value in solr/common-build.xml is obsolete > --- > > Key: SOLR-2590 > URL: https://issues.apache.org/jira/browse/SOLR-2590 > Project: Solr > Issue Type: Bug > Components: Build >Affects Versions: 4.0 >Reporter: Steven Rowe >Assignee: Steven Rowe >Priority: Minor > Fix For: 3.3, 4.0 > > > The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target > no longer works. > From > https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : > {noformat} > [javadoc] javadoc: warning - Error fetching URL: > https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list > ... > BUILD FAILED > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: > The following error occurred while executing this line: > /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: > Javadocs warnings were found! > {noformat} > The link should instead be > https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-2590) javadoc.link.lucene property value in solr/common-build.xml is obsolete
javadoc.link.lucene property value in solr/common-build.xml is obsolete --- Key: SOLR-2590 URL: https://issues.apache.org/jira/browse/SOLR-2590 Project: Solr Issue Type: Bug Components: Build Affects Versions: 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.3, 4.0 The link to the Jenkins-built Lucene javadocs used by Solr's "javadoc" target no longer works. From https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/consoleText : {noformat} [javadoc] javadoc: warning - Error fetching URL: https://hudson.apache.org/hudson/job/Lucene-trunk/javadoc/all/package-list ... BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build.xml:213: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/common-build.xml:389: Javadocs warnings were found! {noformat} The link should instead be https://builds.apache.org/job/Lucene-trunk/javadoc/all/package-list -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: repo doesn't like me this morning.
I'm getting updates again, apparently the problem is fixed. On Mon, Jun 13, 2011 at 12:36 PM, Upayavira wrote: > Is the EU one getting updates? I've seen a suggestion that commits from > the US aren't getting to it. > > Upayavira > > On Mon, 13 Jun 2011 10:05 -0400, "Robert Muir" wrote: >> Yeah, but you can svn update/checkout/make patches etc until the main >> one comes back online, then switch back for committing >> >> On Mon, Jun 13, 2011 at 10:03 AM, Uwe Schindler wrote: >> > The European one works read only. Committing of course also fails. >> > >> > Uwe >> > -- >> > Uwe Schindler >> > H.-H.-Meier-Allee 63, 28213 Bremen >> > http://www.thetaphi.de >> > >> > >> > >> > Robert Muir schrieb: >> >> >> >> svn switch --relocate >> >> https://svn.apache.org/repos/asf/lucene/dev/trunk >> >> https://svn.eu.apache.org/repos/asf/lucene/dev/trunk >> >> >> >> On Mon, Jun 13, 2011 at 9:55 AM, Uwe Schindler wrote: >> >> > It's currently broken. See monitoring page. >> >> > >> >> > Uwe >> >> > -- >> >> > Uwe Schindler >> >> > H.-H.-Meier-Allee 63, 28213 Bremen >> >> > http://www.thetaphi.de >> >> > >> >> > >> >> > >> >> > Erick Erickson schrieb: >> >> >> >> >> >> Trying to do a simple "svn update" on the trunk gives me an "svn: >> >> >> access to 'http://svn.apache.org/repos/asf/lucene/dev/trunk' >> >> >> forbidden" error. This is from the shell on OS X. I was able to check >> >> >> out the trunk last night >> >> >> >> >> >> Are others seeing this or am I special (perhaps part way through >> >> >> getting full rights)? Or do I have to authenticate now? It's no huge >> >> >> deal, I'm not in a particular hurry if it's just me, but if it's >> >> >> everybody it's more serious...
>> >> >> >> >> >> Thanks, >> >> >> Erick >> >> >> >> >> >> >> >> >> >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> >> >> >> >> >> > >> >> >> >> >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> >> >> > >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> > --- > Enterprise Search Consultant at Sourcesense UK, > Making Sense of Open Source > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (SOLR-2574) Add SLF4J-nop dependency
[ https://issues.apache.org/jira/browse/SOLR-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reopened SOLR-2574: - Assignee: Shalin Shekhar Mangar slf4j in branch 3x still needs to be updated to 1.6 Thanks for reminding, Gabriele :) > Add SLF4J-nop dependency > > > Key: SOLR-2574 > URL: https://issues.apache.org/jira/browse/SOLR-2574 > Project: Solr > Issue Type: Bug >Reporter: Gabriele Kahlout >Assignee: Shalin Shekhar Mangar >Priority: Minor > Fix For: 4.0 > > Attachments: solrjtest.zip > > > Whatever the merits of slf4j, a quick solrj test should work. > I've attached a sample 1-line project with a dependency on solrj-3.2; on run it > prints: > {code} > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory > at > org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.(CommonsHttpSolrServer.java:72) > at com.mysimpatico.solrjtest.App.main(App.java:12) > {code} > Uncomment the nop dependency and it will work. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2574) Add SLF4J-nop dependency
[ https://issues.apache.org/jira/browse/SOLR-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048666#comment-13048666 ] Gabriele Kahlout commented on SOLR-2574: Thank you. What about the 3_x branch? > Add SLF4J-nop dependency > > > Key: SOLR-2574 > URL: https://issues.apache.org/jira/browse/SOLR-2574 > Project: Solr > Issue Type: Bug >Reporter: Gabriele Kahlout >Priority: Minor > Fix For: 4.0 > > Attachments: solrjtest.zip > > > Whatever the merits of slf4j, a quick solrj test should work. > I've attached a sample 1-line project with a dependency on solrj-3.2; on run it > prints: > {code} > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory > at > org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.(CommonsHttpSolrServer.java:72) > at com.mysimpatico.solrjtest.App.main(App.java:12) > {code} > Uncomment the nop dependency and it will work. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2551) Checking dataimport.properties for write access during startup
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar reassigned SOLR-2551: --- Assignee: Shalin Shekhar Mangar > Checking dataimport.properties for write access during startup > -- > > Key: SOLR-2551 > URL: https://issues.apache.org/jira/browse/SOLR-2551 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4.1, 3.1 >Reporter: C S >Assignee: Shalin Shekhar Mangar >Priority: Minor > > A common mistake is that the /conf directory (respectively the dataimport.properties > file) is not writable for Solr. It would be great if that were detected on > starting a dataimport job. > Currently an import might grind away for days and fail if it can't write its > timestamp to the dataimport.properties file. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2551) Checking dataimport.properties for write access during startup
[ https://issues.apache.org/jira/browse/SOLR-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048662#comment-13048662 ] Shalin Shekhar Mangar commented on SOLR-2551: - If DIH is unable to write dataimport.properties, it logs a message saying so. We don't want the import to fail in this case because a lot of people use only full-imports, which do not need the dataimport.properties at all. What do you suggest? > Checking dataimport.properties for write access during startup > -- > > Key: SOLR-2551 > URL: https://issues.apache.org/jira/browse/SOLR-2551 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler >Affects Versions: 1.4.1, 3.1 >Reporter: C S >Priority: Minor > > A common mistake is that the /conf directory (respectively the dataimport.properties > file) is not writable for Solr. It would be great if that were detected on > starting a dataimport job. > Currently an import might grind away for days and fail if it can't write its > timestamp to the dataimport.properties file. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
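The up-front check the reporter asks for could be as small as the following hypothetical helper (this is not DIH's actual code — DIH currently just logs when the write fails; the class and method names here are invented for illustration):

```java
import java.io.File;

public class DataImportCheck {
    // Hypothetical helper: before a long delta-import starts, report whether
    // the timestamp in dataimport.properties can actually be persisted.
    public static boolean canPersistTimestamp(File confDir) {
        File props = new File(confDir, "dataimport.properties");
        if (props.exists()) {
            return props.canWrite();
        }
        // The file does not exist yet; it is created in confDir on the first
        // successful write, so the directory itself must be writable.
        return confDir.canWrite();
    }
}
```

A caller could run this at handler startup and log a warning instead of failing, which keeps full-import-only setups unaffected.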
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 8811 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/8811/ All tests passed Build Log (for compile errors): [...truncated 13695 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2574) Add SLF4J-nop dependency
[ https://issues.apache.org/jira/browse/SOLR-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-2574. - Resolution: Fixed Fix Version/s: (was: 1.4) 4.0 slf4j is v1.6.1 in trunk > Add SLF4J-nop dependency > > > Key: SOLR-2574 > URL: https://issues.apache.org/jira/browse/SOLR-2574 > Project: Solr > Issue Type: Bug >Reporter: Gabriele Kahlout >Priority: Minor > Fix For: 4.0 > > Attachments: solrjtest.zip > > > Whatever the merits of slf4j, a quick solrj test should work. > I've attached a sample 1-line project with a dependency on solrj-3.2; on run it > prints: > {code} > java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory > at > org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.(CommonsHttpSolrServer.java:72) > at com.mysimpatico.solrjtest.App.main(App.java:12) > {code} > Uncomment the nop dependency and it will work. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
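The workaround the reporter describes — adding a no-op SLF4J binding next to solrj — would look roughly like this in a Maven POM (artifact coordinates assumed from the artifacts published at the time; the slf4j-nop version should match the slf4j-api version solrj pulls in):

```xml
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>3.2.0</version>
</dependency>
<!-- Swap slf4j-nop for a real binding (e.g. slf4j-log4j12) to get actual logging -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-nop</artifactId>
  <version>1.6.1</version>
</dependency>
```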
[jira] [Commented] (SOLR-2136) Function Queries: if() function
[ https://issues.apache.org/jira/browse/SOLR-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048646#comment-13048646 ] Yonik Seeley commented on SOLR-2136: Thanks Koji, I just committed a fix for this cut'n'paste error. > Function Queries: if() function > --- > > Key: SOLR-2136 > URL: https://issues.apache.org/jira/browse/SOLR-2136 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.4.1 >Reporter: Jan Høydahl > Fix For: 4.0 > > Attachments: SOLR-2136.patch, SOLR-2136.patch > > > Add an if() function which will enable conditional function queries. > The function could be modeled after a spreadsheet if function (e.g: > http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_IF_function) > IF(test; value1; value2) where: > test is or refers to a logical value or expression that returns a logical > value (TRUE or FALSE). > value1 is the value that is returned by the function if test yields TRUE. > value2 is the value that is returned by the function if test yields FALSE. > If value2 is omitted it is assumed to be FALSE; if value1 is also omitted it > is assumed to be TRUE. > Example use: > if(color=="red"; 100; if(color=="green"; 50; 25)) > This function will check the document field "color", and if it is "red" > return 100, if it is "green" return 50, else return 25. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
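The spreadsheet-style semantics quoted in the issue can be sketched in plain Java (illustrative only — Solr's real implementation evaluates ValueSource objects per document, not plain values):

```java
public class IfFuncSketch {
    // IF(test; value1; value2): return value1 when the test holds, else value2.
    static double ifFunc(boolean test, double value1, double value2) {
        return test ? value1 : value2;
    }

    // Mirrors the nested example from the issue:
    // if(color=="red"; 100; if(color=="green"; 50; 25))
    static double colorScore(String color) {
        return ifFunc("red".equals(color), 100,
               ifFunc("green".equals(color), 50, 25));
    }
}
```

Nesting falls out for free, exactly as in the spreadsheet model the issue references.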
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan! On Mon, Jun 13, 2011 at 8:13 PM, Mark Miller wrote: > I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as > our newest committer. > > Jan, if you don't mind, could you introduce yourself with a brief bio as > has become our tradition? > > Congratulations and welcome aboard! > > > - Mark Miller > lucidimagination.com > > > > > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > -- Regards, Shalin Shekhar Mangar.
RE: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! > -Original Message- > From: Mark Miller [mailto:markrmil...@gmail.com] > Sent: Monday, June 13, 2011 10:43 AM > To: dev@lucene.apache.org > Subject: Welcome Jan Høydahl as Lucene/Solr committer > > I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl > as our newest committer. > > Jan, if you don't mind, could you introduce yourself with a brief bio as > has become our tradition? > > Congratulations and welcome aboard! > > > - Mark Miller > lucidimagination.com > > > > > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2136) Function Queries: if() function
[ https://issues.apache.org/jira/browse/SOLR-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048642#comment-13048642 ] Koji Sekiguchi commented on SOLR-2136: -- I've run into some strange behavior. With an empty index, start the example Solr. Then hit: http://localhost:8983/solr/select/?q={!func}if(exists(f1_b),10,20)&debug=results You get an empty XML response, as expected. Hit the above URL again and you get the following exception: {code} SEVERE: java.lang.ClassCastException: org.apache.solr.search.ValueSourceParser$60$1 cannot be cast to org.apache.solr.search.function.SingleFunction at org.apache.solr.search.function.SimpleBoolFunction.equals(SimpleBoolFunction.java:66) at org.apache.solr.search.function.IfFunction.equals(IfFunction.java:137) at org.apache.solr.search.function.FunctionQuery.equals(FunctionQuery.java:202) at org.apache.solr.search.QueryResultKey.equals(QueryResultKey.java:78) at java.util.HashMap.getEntry(HashMap.java:349) at java.util.LinkedHashMap.get(LinkedHashMap.java:280) at org.apache.solr.search.LRUCache.get(LRUCache.java:129) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:991) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:346) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:441) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:239) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1308) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:353) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:248) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) 
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) {code} > Function Queries: if() function > --- > > Key: SOLR-2136 > URL: https://issues.apache.org/jira/browse/SOLR-2136 > Project: Solr > Issue Type: New Feature > Components: search >Affects Versions: 1.4.1 >Reporter: Jan Høydahl > Fix For: 4.0 > > Attachments: SOLR-2136.patch, SOLR-2136.patch > > > Add an if() function which will enable conditional function queries. > The function could be modeled after a spreadsheet if function (e.g: > http://wiki.services.openoffice.org/wiki/Documentation/How_Tos/Calc:_IF_function) > IF(test; value1; value2) where: > test is or refers to a logical value or expression that returns a logical > value (TRUE or FALSE). > value1 is the value that is returned by the function if test yields TRUE. > value2 is the value that is returned by the function if test yields FALSE. > If value2 is omitted it is assumed to be FALSE; if value1 is also omitted it > is assumed to be TRUE. 
> Example use: > if(color=="red"; 100; if(color=="green"; 50; 25)) > This function will check the document field "color", and if it is "red" > return 100, if it is "green" return 50, else return 25. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
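The root cause visible in the trace above is an equals() implementation that casts its argument without first checking the runtime type, so the second request — which compares the new query against the cached QueryResultKey — blows up. The defensive pattern looks like this (illustrative class only; the committed fix may differ):

```java
public class GuardedEquals {
    private final int arg;

    public GuardedEquals(int arg) { this.arg = arg; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Check the type before any cast: comparing against an unrelated
        // ValueSource subclass must return false, not throw ClassCastException.
        if (!(o instanceof GuardedEquals)) return false;
        return arg == ((GuardedEquals) o).arg;
    }

    @Override
    public int hashCode() { return arg; }
}
```

The equals contract requires returning false for foreign types; throwing from equals breaks every hash-based cache that stores the object as a key.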
Re: repo doesn't like me this morning.
Is the EU one getting updates? I've seen a suggestion that commits from the US aren't getting to it. Upayavira On Mon, 13 Jun 2011 10:05 -0400, "Robert Muir" wrote: > Yeah, but you can svn update/checkout/make patches etc until the main > one comes back online, then switch back for committing > > On Mon, Jun 13, 2011 at 10:03 AM, Uwe Schindler wrote: > > The European one works read only. Committing of course also fails. > > > > Uwe > > -- > > Uwe Schindler > > H.-H.-Meier-Allee 63, 28213 Bremen > > http://www.thetaphi.de > > > > > > > > Robert Muir schrieb: > >> > >> svn switch --relocate > >> https://svn.apache.org/repos/asf/lucene/dev/trunk > >> https://svn.eu.apache.org/repos/asf/lucene/dev/trunk > >> > >> On Mon, Jun 13, 2011 at 9:55 AM, Uwe Schindler wrote: > >> > It's currently broken. See monitoring page. > >> > > >> > Uwe > >> > -- > >> > Uwe Schindler > >> > H.-H.-Meier-Allee 63, 28213 Bremen > >> > http://www.thetaphi.de > >> > > >> > > >> > > >> > Erick Erickson schrieb: > >> >> > >> >> Trying to do a simple "svn update" on the trunk gives me an "svn: > >> >> access to 'http://svn.apache.org/repos/asf/lucene/dev/trunk' > >> >> forbidden" error. This is from the shell on OS X. I was able to check > >> >> out the trunk last night. > >> >> > >> >> Are others seeing this or am I special (perhaps part way through > >> >> getting full rights)? Or do I have to authenticate now? It's no huge > >> >> deal, I'm not in a particular hurry if it's just me, but if it's > >> >> everybody it's more serious... 
> >> >> > >> >> Thanks, > >> >> Erick > >> >> > >> >> > >> > >> >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> >> > >> >> > >> > > >> > >> > >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > >> For additional commands, e-mail: dev-h...@lucene.apache.org > >> > >> > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > --- Enterprise Search Consultant at Sourcesense UK, Making Sense of Open Source - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome Jan! On 13 June 2011 17:30, Koji Sekiguchi wrote: > Welcome! > > > (11/06/13 23:43), Mark Miller wrote: > >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as >> our newest committer. >> >> Jan, if you don't mind, could you introduce yourself with a brief bio as >> has become our tradition? >> >> Congratulations and welcome aboard! >> >> >> - Mark Miller >> lucidimagination.com >> >> >> >> >> >> >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> >> > > -- > http://www.rondhuit.com/en/ > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > -- Met vriendelijke groet, Martijn van Groningen
Participation Requested: Survey about Open-Source Software Development
Hi, Drs. Jeffrey Carver, Rosanna Guadagno, Debra McCallum, and Mr. Amiangshu Bosu, University of Alabama, and Dr. Lorin Hochstein, University of Southern California, are conducting a survey of open-source software developers. This survey seeks to understand how developers on distributed, virtual teams, like open-source projects, interact with each other to accomplish their tasks. You must be at least 19 years of age to complete the survey. The survey should take approximately 15 minutes to complete. If you are actively participating as a developer, please consider completing our survey. Here is the link to the survey: http://goo.gl/HQnux We apologize for inconvenience and if you receive multiple copies of this email. This survey has been approved by The University of Alabama IRB board. Thanks, Dr. Jeffrey Carver Assistant Professor University of Alabama (v) 205-348-9829 (f) 205-348-0219 http://www.cs.ua.edu/~carver - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2400) FieldAnalysisRequestHandler; add information about token-relation
[ https://issues.apache.org/jira/browse/SOLR-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-2400. - Resolution: Fixed Committed trunk revision: 1135154 Committed 3.x branch revision: 1135156 > FieldAnalysisRequestHandler; add information about token-relation > - > > Key: SOLR-2400 > URL: https://issues.apache.org/jira/browse/SOLR-2400 > Project: Solr > Issue Type: Improvement > Components: Schema and Analysis >Reporter: Stefan Matheis (steffkes) >Assignee: Uwe Schindler >Priority: Minor > Fix For: 3.3, 4.0 > > Attachments: 110303_FieldAnalysisRequestHandler_output.xml, > 110303_FieldAnalysisRequestHandler_view.png, SOLR-2400-revision1.patch, > SOLR-2400-revision1.patch, SOLR-2400.patch, SOLR-2400.patch, SOLR-2400.patch, > SOLR-2400.patch, field.xml > > > The XML-Output (simplified example attached) is missing one small information > .. which could be very useful to build an nice Analysis-Output, and that's > "Token-Relation" (if there is special/correct word for this, please correct > me). > Meaning, that is actually not possible to "follow" the Analysis-Process > (completly) while the Tokenizers/Filters will drop out Tokens (f.e. StopWord) > or split it into multiple Tokens (f.e. WordDelimiter). > Would it be possible to include this Information? If so, it would be possible > to create an improved Analysis-Page for the new Solr Admin (SOLR-2399) - > short scribble attached -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-2878) Allow Scorer to expose positions and payloads aka. nuke spans
[ https://issues.apache.org/jira/browse/LUCENE-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2878: Attachment: LUCENE-2878_trunk.patch here is a patch that applies to trunk. I added a simple, maybe slowish PositionTermScorer that is used when positions are required. This is really work in progress but I am uploading it just in case somebody is interested. > Allow Scorer to expose positions and payloads aka. nuke spans > -- > > Key: LUCENE-2878 > URL: https://issues.apache.org/jira/browse/LUCENE-2878 > Project: Lucene - Java > Issue Type: Improvement > Components: core/search >Affects Versions: Bulk Postings branch >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Labels: gsoc2011, lucene-gsoc-11, mentor > Attachments: LUCENE-2878.patch, LUCENE-2878.patch, LUCENE-2878.patch, > LUCENE-2878.patch, LUCENE-2878_trunk.patch > > > Currently we have two somewhat separate types of queries: the ones which can > make use of positions (mainly spans) and payloads (spans). Yet Span*Query > doesn't really do scoring comparable to what other queries do, and at the end > of the day they are duplicating a lot of code all over Lucene. Span*Queries are > also limited to other Span*Query instances, such that you can not use a > TermQuery or a BooleanQuery with SpanNear or anything like that. > Besides the Span*Query limitation, other queries lack a quite interesting > feature: they can not score based on term proximity, since scorers don't > expose any positional information. All those problems bugged me for a while, > so I started working on that using the bulkpostings API. I would have done > that first cut on trunk, but TermScorer there works on a BlockReader that does not > expose positions, while the one in this branch does. 
I started adding a new > Positions class which users can pull from a scorer; to prevent unnecessary > positions enums I added ScorerContext#needsPositions and eventually > Scorer#needsPayloads to create the corresponding enum on demand. Yet, > currently only TermQuery / TermScorer implements this API and others simply > return null instead. > To show that the API really works and our BulkPostings work fine too with > positions, I cut over TermSpanQuery to use a TermScorer under the hood and > nuked TermSpans entirely. A nice side effect of this was that the Position > BulkReading implementation got some exercise, which now :) all works with > positions, while Payloads for bulkreading are kind of experimental in the > patch and only work with the Standard codec. > So all spans now work on top of TermScorer (I truly hate spans since today), > including the ones that need Payloads (StandardCodec ONLY)!! I didn't bother > to implement the other codecs yet since I want to get feedback on the API and > on this first cut before I go on with it. I will upload the corresponding > patch in a minute. > I also had to cut over SpanQuery.getSpans(IR) to > SpanQuery.getSpans(AtomicReaderContext), which I should probably do on trunk > first, but after that pain today I need a break first :). > The patch passes all core tests > (org.apache.lucene.search.highlight.HighlighterTest still fails but I didn't > look into the MemoryIndex BulkPostings API yet) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048616#comment-13048616 ] Michael McCandless commented on LUCENE-3196: Ahh yes great! selckin's random number generator should hit 1 frequently ;) > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance which is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048615#comment-13048615 ] Simon Willnauer commented on LUCENE-3196: - bq. Do we have a test (eg a random test that picks random fixed byte[] size) that covers this...? yes, the fixed length is selected at random in the tests; I fixed that in the patch too. > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance which is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3196) Optimize FixedStraightBytes for bytes size == 1
[ https://issues.apache.org/jira/browse/LUCENE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048606#comment-13048606 ] Michael McCandless commented on LUCENE-3196: Looks good Simon! Probably other smallish sizes (2, 3, 4, ...) could be a single array too, ie paged or not should be separately controllable, but we can do that later; this is a great baby step since we need this for norms cutover. Do we have a test (eg a random test that picks random fixed byte[] size) that covers this...? > Optimize FixedStraightBytes for bytes size == 1 > --- > > Key: LUCENE-3196 > URL: https://issues.apache.org/jira/browse/LUCENE-3196 > Project: Lucene - Java > Issue Type: Improvement > Components: core/index, core/search >Affects Versions: 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer >Priority: Minor > Fix For: 4.0 > > Attachments: LUCENE-3196.patch > > > Currently we read all the bytes in a PagedBytes instance which is unneeded for > single byte values like norms. For fast access this should simply be a > straight array. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
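The optimization being discussed can be pictured as a specialization: when every value is exactly one byte, a plain byte[] indexed by docID replaces the paged lookup. This is a sketch with assumed names, not the actual Lucene code:

```java
public class FixedBytesSketch {
    // size == 1 fast path: one byte per document, direct array access.
    static byte singleByteValue(byte[] values, int docId) {
        return values[docId];
    }

    // Generic paged lookup that the size == 1 case gets to skip entirely:
    // the high bits of the docID pick the page, the low bits the offset.
    static byte pagedValue(byte[][] pages, int pageBits, int docId) {
        return pages[docId >>> pageBits][docId & ((1 << pageBits) - 1)];
    }
}
```

The fast path removes one indirection and the shift/mask arithmetic per lookup, which matters for norms that are read for every scored document.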
[jira] [Commented] (LUCENE-3193) TwoPhaseCommit interface
[ https://issues.apache.org/jira/browse/LUCENE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048605#comment-13048605 ] Michael McCandless commented on LUCENE-3193: Looks great Shai! > TwoPhaseCommit interface > > > Key: LUCENE-3193 > URL: https://issues.apache.org/jira/browse/LUCENE-3193 > Project: Lucene - Java > Issue Type: New Feature > Components: core/index >Reporter: Shai Erera >Assignee: Shai Erera > Fix For: 3.3, 4.0 > > Attachments: LUCENE-3193.patch, LUCENE-3193.patch > > > I would like to propose a TwoPhaseCommit interface which declares the methods > necessary to implement a 2-phase commit algorithm: > * prepareCommit() > * commit() > * rollback() > The prepare/commit ones have variants that take a (Map > commitData) following the ones we have in IndexWriter. > In addition, a TwoPhaseCommitTool which implements a 2-phase commit amongst > several TPCs. > Having IndexWriter implement that interface will allow running the 2-phase > commit algorithm on multiple IWs or IW + any other object that implements the > interface. > We should mark the interface @lucene.internal so as to not block ourselves in > the future. This is pretty advanced stuff anyway. > Will post a patch soon -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
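The proposed shape can be sketched as follows. The method names come from the issue; the checked exceptions and the Map commitData variants are omitted, and the coordinator is an assumed reading of what the TwoPhaseCommitTool would do:

```java
import java.util.List;

interface TwoPhaseCommit {
    void prepareCommit();
    void commit();
    void rollback();
}

final class TwoPhaseCommitTool {
    // Prepare every participant first; only when all prepares succeed are the
    // commits run. Any prepare failure rolls everyone back, so no participant
    // ends up committed while another one is not.
    static void execute(List<? extends TwoPhaseCommit> participants) {
        try {
            for (TwoPhaseCommit tpc : participants) {
                tpc.prepareCommit();
            }
        } catch (RuntimeException e) {
            for (TwoPhaseCommit tpc : participants) {
                tpc.rollback();
            }
            throw e;
        }
        for (TwoPhaseCommit tpc : participants) {
            tpc.commit();
        }
    }
}
```

With IndexWriter implementing the interface, one such loop can coordinate a commit across several IndexWriters, or an IndexWriter plus any other participant.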
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! (11/06/13 23:43), Mark Miller wrote: I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as our newest committer. Jan, if you don't mind, could you introduce yourself with a brief bio as has become our tradition? Congratulations and welcome aboard! - Mark Miller lucidimagination.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- http://www.rondhuit.com/en/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! Mike McCandless http://blog.mikemccandless.com On Mon, Jun 13, 2011 at 10:43 AM, Mark Miller wrote: > I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as > our newest committer. > > Jan, if you don't mind, could you introduce yourself with a brief bio as has > become our tradition? > > Congratulations and welcome aboard! > > > - Mark Miller > lucidimagination.com > > > > > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Jan Høydahl as Lucene/Solr committer
Welcome! On Mon, Jun 13, 2011 at 5:04 PM, Robert Muir wrote: > Welcome Jan! > > On Mon, Jun 13, 2011 at 10:43 AM, Mark Miller wrote: >> I'm happy to announce that the Lucene/Solr PMC has voted in Jan Høydahl as >> our newest committer. >> >> Jan, if you don't mind, could you introduce yourself with a brief bio as has >> become our tradition? >> >> Congratulations and welcome aboard! >> >> >> - Mark Miller >> lucidimagination.com >> >> >> >> >> >> >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org >> For additional commands, e-mail: dev-h...@lucene.apache.org >> >> > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org