[jira] Reopened: (SOLR-2411) Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it
[ https://issues.apache.org/jira/browse/SOLR-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe reopened SOLR-2411:
-------------------------------

bq. After this commit, solr cannot load jars in example

Not good.

bq. We've really gone backwards from a working build to a non-working build.

Yeah, that's bad.

bq. ant dist always created jars and wars under /dist, and solr configs refer to those.

I hadn't realized that.

{quote}
ant dist, ant example, etc, should result in the same directory structure as what you get in a binary download. But after this change, a binary download still contains the jars in the /dist directory (and basically, so should the results of ant dist). I think the only cleanup needed here is to create a packages directory where the results of ant package go, rather than going into dist along with the other jars.
{quote}

I agree - I'll revert and work up a patch implementing this.

Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it
---------------------------------------------------------------------------------------------------------

Key: SOLR-2411
URL: https://issues.apache.org/jira/browse/SOLR-2411
Project: Solr
Issue Type: Improvement
Components: Build
Affects Versions: 3.1, 3.2, 4.0
Reporter: Steven Rowe
Assignee: Steven Rowe
Priority: Minor
Fix For: 3.1, 3.2, 4.0
Attachments: SOLR-2411.patch

Build targets dist, dist-*, create-package, package, package-src, etc. use {{dist/}} as a landing spot for intermediate .jar files which will not be individually shipped. These targets should instead use {{solr/build/}} to hold these intermediate .jars.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Issue Comment Edited: (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004127#comment-13004127 ]

Stefan Matheis (steffkes) edited comment on SOLR-2399 at 3/8/11 6:39 PM:
-------------------------------------------------------------------------

Short update, mainly things that Jan mentioned on the mailing list:

bq. Move the links on top to bottom, reserving the top for navigation.

done

bq. The send email could be changed to Community forum and instead of linking to mailto:solr-u...@lucene.apache.org, link to http://wiki.apache.org/solr/UsingMailingLists

done

bq. Add a link to IRC chat. http://webchat.freenode.net/?channels=#solr That would surely increase the activity on the channel :)

done

{quote}
Include a Dev/Test/Prod indication: It is common to have three different environments, one for test, one for development and one live production. It happens now and then that you do the wrong action on the wrong server :( so a visual clue as to which environment you're in is very useful. I propose a simple solid bar on the very top which is RED for prod, YELLOW for test and GREEN for dev. Would it be possible to read a Java system property -Dsolr.environment=dev and based on that set the color of such a top-bar?
{quote}

Also done - and yes, reading command-line args is possible =) Currently, the interface checks {{solr.environment}} and displays its value in the upper right corner. Additionally, if the value starts with {{dev|test|prod}}, the box gets a color highlight.

A few minutes ago I pushed the last update to my GitHub repo, which includes a (very) basic tree view for ZooKeeper data. And also yes, that (re-)introduced zookeeper.jsp, which is used for generating json-structured zk-tree data - based on the existing one, just rearranged a few things.

Solr Admin Interface, reworked
------------------------------

Key: SOLR-2399
URL: https://issues.apache.org/jira/browse/SOLR-2399
Project: Solr
Issue Type: Improvement
Components: web gui
Reporter: Stefan Matheis (steffkes)
Priority: Minor

*The idea was to create a new, fresh (and hopefully clean) Solr Admin Interface.* [Based on this [ML-Thread|http://www.lucidimagination.com/search/document/ae35e236d29d225e/solr_admin_interface_reworked_go_on_go_away]]

I've quickly created a GitHub repository (just for me, to keep track of the changes) » https://github.com/steffkes/solr-admin

[This commit shows the differences|https://github.com/steffkes/solr-admin/commit/5f80bb0ea9deb4b94162632912fe63386f869e0d] between the old/existing index.jsp and my new one (which is copy/pasted from the existing one).

Main action takes place in [js/script.js|https://github.com/steffkes/solr-admin/blob/master/js/script.js], which is neither clean nor pretty - it's work in progress, so ... give it a try. It's developed with Firefox as the browser, so, for a first impression, please don't use _things_ like Internet Explorer ;o

Jan already suggested a bunch of good things; I'm sure there are more ideas over there :)
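The Dev/Test/Prod bar described above boils down to reading a JVM system property. A minimal sketch (the property name comes from the thread; the class and method names and the exact prefix matching are my own assumptions, not the admin UI's actual code):

```java
import java.util.Locale;

public class EnvironmentBadge {
    // Map the -Dsolr.environment value to the highlight color proposed in the
    // thread: prod=red, test=yellow, dev=green, anything else unhighlighted.
    static String colorFor(String environment) {
        if (environment == null) return "none";
        String env = environment.toLowerCase(Locale.ROOT);
        if (env.startsWith("prod")) return "red";
        if (env.startsWith("test")) return "yellow";
        if (env.startsWith("dev")) return "green";
        return "none";
    }

    public static void main(String[] args) {
        // Reads the value passed on the command line, e.g. -Dsolr.environment=dev
        String env = System.getProperty("solr.environment");
        System.out.println(env + " -> " + colorFor(env));
    }
}
```

Run with `java -Dsolr.environment=dev EnvironmentBadge` to see the mapping applied.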
[jira] Updated: (SOLR-2381) The included jetty server does not support UTF-8
[ https://issues.apache.org/jira/browse/SOLR-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated SOLR-2381:
------------------------------

Attachment: SOLR-2381_xmltest.patch

Attached is a unit test. If you disable 'case 4' so that it only uses 1-, 2-, and 3-byte codepoints, the test always passes. Additionally, it only fails with the XML response format (the default binary format is fine); the test chooses different formats for each iteration.

{noformat}
junit-sequential:
    [junit] Testsuite: org.apache.solr.client.solrj.embedded.SolrExampleJettyTest
    [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 3.829 sec
    [junit]
    [junit] - Standard Error -
    [junit] NOTE: reproduce with: ant test -Dtestcase=SolrExampleJettyTest -Dtestmethod=testUnicode -Dtests.seed=-8507816048970822444:1424998400651628841
    [junit] WARNING: test class left thread running: Thread[MultiThreadedHttpConnectionManager cleanup,5,main]
    [junit] RESOURCE LEAK: test class left 1 thread(s) running
    [junit] NOTE: test params are: codec=PreFlex, locale=es_GT, timezone=Asia/Hovd
    [junit] NOTE: all tests run in this JVM:
    [junit] [SolrExampleJettyTest]
    [junit] NOTE: Windows Vista 6.0 x86/Sun Microsystems Inc. 1.6.0_23 (32-bit)/cpus=4,threads=2,free=9760576,total=16252928
    [junit] - ---
    [junit] Testcase: testUnicode(org.apache.solr.client.solrj.embedded.SolrExampleJettyTest): Caused an ERROR
    [junit] Error executing query
    [junit] org.apache.solr.client.solrj.SolrServerException: Error executing query
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
    [junit]     at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:119)
    [junit]     at org.apache.solr.client.solrj.SolrExampleTests.testUnicode(SolrExampleTests.java:290)
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1213)
    [junit]     at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1145)
    [junit] Caused by: org.apache.solr.common.SolrException: parsing error
    [junit]     at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:145)
    [junit]     at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:106)
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:478)
    [junit]     at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:245)
    [junit]     at org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
    [junit] Caused by: com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 character 0xdf05(a surrogate character) at char #2475, byte #127)
    [junit]     at com.ctc.wstx.sr.StreamScanner.throwFromIOE(StreamScanner.java:708)
    [junit]     at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1086)
    [junit]     at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:218)
    [junit]     at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:244)
    [junit]     at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:130)
    [junit] Caused by: java.io.CharConversionException: Invalid UTF-8 character 0xdf05(a surrogate character) at char #2475, byte #127)
    [junit]     at com.ctc.wstx.io.UTF8Reader.reportInvalid(UTF8Reader.java:335)
    [junit]     at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:247)
    [junit]     at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
    [junit]     at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
    [junit]     at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
    [junit]     at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:992)
    [junit]     at com.ctc.wstx.sr.StreamScanner.getNext(StreamScanner.java:763)
    [junit]     at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2721)
    [junit]     at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
    [junit]
{noformat}

The included jetty server does not support UTF-8
------------------------------------------------

Key: SOLR-2381
URL: https://issues.apache.org/jira/browse/SOLR-2381
Project: Solr
Issue Type: Bug
Reporter: Robert Muir
Assignee: Robert Muir
Priority: Blocker
Fix For: 3.1, 4.0
Attachments: SOLR-2381.patch, SOLR-2381_xmltest.patch, SOLR-ServletOutputWriter.patch, jetty-6.1.26-patched-JETTY-1340.jar, jetty-util-6.1.26-patched-JETTY-1340.jar

Some
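The root failure above reflects a general Java/Unicode fact worth keeping in mind when debugging this: a lone UTF-16 surrogate (such as the 0xdf05 in the error) has no legal UTF-8 encoding, so a strict encoder must reject it. A standalone demonstration (not part of the patch; class and method names are mine):

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class LoneSurrogateDemo {
    // Returns true if the char sequence survives strict UTF-8 encoding.
    static boolean encodesCleanly(char[] chars) {
        // A freshly created encoder REPORTs malformed input by default.
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        try {
            enc.encode(CharBuffer.wrap(chars));
            return true;
        } catch (CharacterCodingException e) {
            return false;  // malformed input, e.g. an unpaired surrogate
        }
    }

    public static void main(String[] args) {
        // A proper surrogate pair (encoding U+10400) is fine...
        System.out.println(encodesCleanly(new char[]{'\uD801', '\uDC00'}));
        // ...but a lone low surrogate like the one in the test failure is not.
        System.out.println(encodesCleanly(new char[]{'\uDF05'}));
    }
}
```

This is why the test only fails once 4-byte (surrogate-pair) codepoints enter the picture: any layer that splits or mishandles a pair produces bytes that Woodstox's UTF8Reader correctly rejects.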
[jira] Commented: (LUCENE-2834) don't spawn thread statically in FSDirectory on Mac OS X
[ https://issues.apache.org/jira/browse/LUCENE-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004179#comment-13004179 ]

Michael McCandless commented on LUCENE-2834:
--------------------------------------------

I just hit this (got a spooky leftover-thread warning running a test on OS X). I think we should fix it.

I like the initial approach: let's not use MessageDigest at all (import our own MD5, which does not spawn threads!!). Sure, it's code duplication, but it's tiny and it mitigates risk, so I think it's well worth it. In general Lucene should not use interesting (risky) parts of the JVM/Java if we can avoid it without much cost, and this is a really silly reason to be using MessageDigest (similar to our now-gone crazy usage of ManagementFactory just to acquire a test lock). We are a search library! We must use the bare minimum of the OS/filesystem/JVM that we need.

In fact, in this case... can't we nuke DIGESTER altogether? Lucene now stores lock files in the index dir by default as write.lock. We only need this digest if you change that dir. So, if your app somehow wants to put the lock file elsewhere (unusual), it should be up to you to name it uniquely relative to other IWs storing locks in the same dir (we can do this under a separate issue).

And not using SecureRandom to create temp files is a no-brainer -- the builtin File.createTempFile must be secure, by design, but we obviously don't need that here. I've had awful problems in the past with SecureRandom (because my machine didn't have enough true randomness!). Again: Lucene should only use what we really need.

I think we can remove the controversial "interrupt the weird OS X PKCS11 thread" from the patch since serialization is now gone? I'd like to know if this thread suddenly pops up again in our tests... and I agree it's dangerous to interrupt this thread (it could then cause weird failures in subsequent tests, e.g. if the thread doesn't restart).

bq. One thing: I don't like the empty catch blocks /* cannot happen */. Even if this is the case, please throw at least a RuntimeException

+1 -- I like this idea (I don't do it now but I'll try to going forward). Defensive programming...

don't spawn thread statically in FSDirectory on Mac OS X
--------------------------------------------------------

Key: LUCENE-2834
URL: https://issues.apache.org/jira/browse/LUCENE-2834
Project: Lucene - Java
Issue Type: Bug
Reporter: Robert Muir
Fix For: 4.0
Attachments: LUCENE-2834.patch, LUCENE-2834.patch

On the Mac, creating the digester starts up a PKCS11 thread. I do not think threads should be created statically (I have this same issue with TimeLimitedCollector and also FilterManager). Uwe discussed removing this md5 digester; I don't care if we remove it or not, just as long as it doesn't create a thread, and as long as it doesn't use the system default locale.
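For context, the "don't create the digester statically" point can be addressed with the standard lazy-holder idiom. This is a sketch of the general pattern only, not the committed fix (class names are mine): the MessageDigest is created on first use, so merely loading the enclosing class never triggers the Mac's PKCS11 thread, and the "cannot happen" exception is rethrown as a RuntimeException rather than swallowed, as Uwe asked.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class LazyMd5 {
    private LazyMd5() {}

    // Holder idiom: the nested class (and thus the digester) is initialized
    // only when digest() is first called, not when LazyMd5 itself loads.
    private static final class Holder {
        static final MessageDigest DIGESTER;
        static {
            try {
                DIGESTER = MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                // "cannot happen" on a compliant JVM, but fail loudly anyway
                throw new RuntimeException(e);
            }
        }
    }

    static byte[] digest(byte[] input) {
        // MessageDigest instances are not thread-safe, so serialize access.
        synchronized (Holder.DIGESTER) {
            Holder.DIGESTER.reset();
            return Holder.DIGESTER.digest(input);
        }
    }
}
```

Of course, Mike's point stands that nuking DIGESTER entirely (or shipping a tiny thread-free MD5) avoids the JVM machinery altogether.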
[jira] Commented: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004184#comment-13004184 ]

Michael McCandless commented on LUCENE-2573:
--------------------------------------------

bq. so you mean we always flush ALL DWPT once we reached the low watermark?

No, I mean: as soon as we pass the low wm, pick the biggest DWPT and flush it. As soon as you mark that DWPT as flushPending, its RAM used is removed from the active pool and added to the flushPending pool. Then, if the active pool again crosses the low wm, pick the biggest and mark it as flush pending, etc. But if the flushing cannot keep up, and the sum of the active + flushPending pools crosses the high wm, you hijack (stall) incoming threads.

I think this may make a good flush-by-RAM policy, but I agree we should test. I think the fully tiered approach may be overly complex...

bq. for now this is internal only so even if we decide to I would shift that to a different issue.

OK, sounds good. Also, if the app really cares about this (I suspect none will) they could make a custom FlushPolicy that they could directly query to find out when threads get stalled.

Besides this, is it only getting flushing of deletes working correctly that remains, before landing RT?

Tiered flushing of DWPTs by RAM with low/high water marks
---------------------------------------------------------

Key: LUCENE-2573
URL: https://issues.apache.org/jira/browse/LUCENE-2573
Project: Lucene - Java
Issue Type: Improvement
Components: Index
Reporter: Michael Busch
Assignee: Simon Willnauer
Priority: Minor
Fix For: Realtime Branch
Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch

Now that we have DocumentsWriterPerThreads we need to track total consumed RAM across all DWPTs. A flushing strategy idea that was discussed in LUCENE-2324 was to use a tiered approach:
- Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM)
- Flush all DWPTs at a high water mark (e.g. at 110%)
- Use linear steps in between high and low watermark: e.g. when 5 DWPTs are used, flush at 90%, 95%, 100%, 105% and 110%.

Should we allow the user to configure the low and high water mark values explicitly using total values (e.g. low water mark at 120MB, high water mark at 140MB)? Or shall we keep for simplicity the single setRAMBufferSizeMB() config method and use something like 90% and 110% for the water marks?
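The policy Mike describes reduces to bookkeeping over two byte pools. A toy sketch of that accounting (all class, method, and field names here are illustrative, not the patch's API):

```java
import java.util.HashMap;
import java.util.Map;

final class WatermarkFlushPolicy {
    private final long lowWatermark, highWatermark;
    private final Map<String, Long> active = new HashMap<>();  // DWPT -> bytes
    private long flushPendingBytes;

    WatermarkFlushPolicy(long lowWatermark, long highWatermark) {
        this.lowWatermark = lowWatermark;
        this.highWatermark = highWatermark;
    }

    void updateBytesUsed(String dwpt, long bytes) { active.put(dwpt, bytes); }

    private long activeBytes() {
        return active.values().stream().mapToLong(Long::longValue).sum();
    }

    // Past the low watermark: pick the biggest DWPT, mark it flush-pending,
    // and move its RAM from the active pool to the pending pool.
    String maybeMarkFlushPending() {
        if (activeBytes() <= lowWatermark) return null;
        String biggest = null;
        long max = -1;
        for (Map.Entry<String, Long> e : active.entrySet()) {
            if (e.getValue() > max) { max = e.getValue(); biggest = e.getKey(); }
        }
        active.remove(biggest);
        flushPendingBytes += max;
        return biggest;
    }

    // Past the high watermark (flushing can't keep up): stall incoming threads.
    boolean shouldStall() {
        return activeBytes() + flushPendingBytes > highWatermark;
    }

    void flushed(long bytes) { flushPendingBytes -= bytes; }
}
```

Note how stalling keys off the *sum* of both pools: marking a DWPT flush-pending moves its bytes out of the active pool, so the low watermark can trigger repeatedly, while the high watermark still sees all un-flushed RAM.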
[jira] Commented: (LUCENE-2950) Modules under top-level modules/ directory should be included in lucene's build targets, e.g. 'package-tgz', 'package-tgz-src', and 'javadocs'
[ https://issues.apache.org/jira/browse/LUCENE-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004188#comment-13004188 ]

Steven Rowe commented on LUCENE-2950:
-------------------------------------

bq. ideally we could remove these dependencies though.

How would this work? E.g. many contribs depend on the common-analyzers module. Removing this dependency would almost certainly make the contribs non-functional. Maybe you mean we should move contribs with {{modules/}} dependencies into {{modules/}}?

Modules under top-level modules/ directory should be included in lucene's build targets, e.g. 'package-tgz', 'package-tgz-src', and 'javadocs'
----------------------------------------------------------------------------------------------------------------------------------------------

Key: LUCENE-2950
URL: https://issues.apache.org/jira/browse/LUCENE-2950
Project: Lucene - Java
Issue Type: Bug
Components: Build
Affects Versions: 4.0
Reporter: Steven Rowe
Priority: Blocker
Fix For: 4.0

Lucene's top level {{modules/}} directory is not included in the binary or source release distribution Ant targets {{package-tgz}} and {{package-tgz-src}}, or in {{javadocs}}, in {{lucene/build.xml}}. (However, these targets do include Lucene contribs.) This issue is visible via the nightly Jenkins (formerly Hudson) job named Lucene-trunk, which publishes binary and source artifacts using {{package-tgz}} and {{package-tgz-src}}, as well as javadocs using the {{javadocs}} target, all run from the top-level {{lucene/}} directory.
[jira] Commented: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004194#comment-13004194 ]

Simon Willnauer commented on LUCENE-2573:
-----------------------------------------

{quote}
I think this may make a good flush by RAM policy, but I agree we should test. I think the fully tiered approach may be overly complex...
{quote}

Yeah, possibly. I think simplifying this is easy now, though...

{quote}
Also, if the app really cares about this (I suspect none will) they could make a custom FlushPolicy that they could directly query to find out when threads get stalled.
{quote}

Yeah, I think we don't need to expose that through IW.

{quote}
Besides this, is it only getting flushing of deletes working correctly that remains, before landing RT?
{quote}

We need to fix LUCENE-2881 first, too.
[jira] Commented: (LUCENE-2950) Modules under top-level modules/ directory should be included in lucene's build targets, e.g. 'package-tgz', 'package-tgz-src', and 'javadocs'
[ https://issues.apache.org/jira/browse/LUCENE-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004205#comment-13004205 ]

Robert Muir commented on LUCENE-2950:
-------------------------------------

bq. How would this work? E.g. many contribs depend on the common-analyzers module. Removing this dependency would almost certainly make the contribs non-functional.

The dependency is mostly bogus. Here are the contribs in question:
* ant
* demo
* lucli
* misc
* spellchecker
* swing
* wordnet

For example, the ant IndexTask only depends on this so it can build this hashmap:

{noformat}
static {
  analyzerLookup.put("simple", SimpleAnalyzer.class.getName());
  analyzerLookup.put("standard", StandardAnalyzer.class.getName());
  analyzerLookup.put("stop", StopAnalyzer.class.getName());
  analyzerLookup.put("whitespace", WhitespaceAnalyzer.class.getName());
}
{noformat}

I think we could remove this: it already has reflection code to build the analyzer, so if you supply "Xyz", why not just look for "XyzAnalyzer" as a fallback?

The lucli code has StandardAnalyzer as a default; I think it's best not to have a default analyzer at all. I would have fixed this already, but this contrib module has no tests! That makes it hard to want to get in there and clean up.

The misc code mostly supplies an Analyzer inside embedded tools that don't actually analyze anything. We could add a package-private NullAnalyzer that throws UOE from its tokenStream() -- especially as they shouldn't be analyzing anything, so that's reasonable to do?

The spellchecker code has a hardcoded WhitespaceAnalyzer... why is this? It seems like the whole spellchecking n-gramming is wrong anyway. Spellchecker uses a special form of n-gramming that depends upon the word length. Currently it does this in Java code and indexes with WhitespaceAnalyzer (creating a lot of garbage in the process, e.g. lots of Field objects), but it seems this could all be cleaned up so that the spellchecker uses its own SpellCheckNgramAnalyzer, for better performance to boot.

The swing code defaults to a WhitespaceAnalyzer... in my opinion, again, it's best not to have a default analyzer and to make the user somehow specify one.

The wordnet code uses StandardAnalyzer for indexing the wordnet database. It also includes a very limited SynonymTokenFilter. In my opinion, now that we merged the SynonymTokenizer from Solr that supports multi-word synonyms etc. (which this wordnet module DOES NOT!), we should nuke this whole thing. Instead, we should make the synonym-loading process more flexible, so that one can produce the SynonymMap from various formats (such as the existing Solr format, a relational database, wordnet's format, or the OpenOffice thesaurus format, among others). We could have parsers for these various formats. This would allow us to have a much more powerful synonym capability that works nicely regardless of format. We could then look at other improvements, such as allowing SynonymFilter to use a more RAM-conscious data structure for its synonym mappings (e.g. FST), and everyone would see the benefits. So hopefully this entire contrib could be deprecated.
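Robert's "look for XyzAnalyzer as a fallback" idea could look roughly like this. This is a hypothetical helper, not IndexTask's actual code, and it is exercised below with plain JDK classes standing in for analyzers:

```java
import java.util.Locale;
import java.util.Map;

public class AnalyzerResolver {
    // Resolve a user-supplied name: check the shorthand table first, then try
    // it as a fully-qualified class name, then retry with an "Analyzer" suffix.
    static Class<?> resolve(String name, Map<String, String> lookup)
            throws ClassNotFoundException {
        String mapped = lookup.get(name.toLowerCase(Locale.ROOT));
        if (mapped != null) {
            return Class.forName(mapped);
        }
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            return Class.forName(name + "Analyzer");  // the proposed fallback
        }
    }
}
```

With a fallback like this, the shorthand table shrinks to a convenience rather than a hard dependency on every analyzer class it names.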
[jira] Commented: (SOLR-2399) Solr Admin Interface, reworked
[ https://issues.apache.org/jira/browse/SOLR-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004221#comment-13004221 ]

Jan Høydahl commented on SOLR-2399:
-----------------------------------

Cool! I tested it. I got a box for the environment, but no color. I copied your code into admin/ in my solr.war - but it seems that your intention is for it to live in /solr-admin for the moment, is that right?

I see that you store the state in the URL, but when reloading the page, the left menu is not expanded.
[jira] Commented: (SOLR-1566) Allow components to add fields to outgoing documents
[ https://issues.apache.org/jira/browse/SOLR-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004227#comment-13004227 ]

Ryan McKinley commented on SOLR-1566:
-------------------------------------

I just added a patch that *almost* works. Rather than bang my head on it some more, I think some feedback would be great.

Originally, I hoped to clean up the ResponseWriter mess and make a single place that would 'augment' docs -- but there are *so* many micro-optimizations for each format that that approach seems difficult. Instead I went for an approach that moves the Set<String> returnFields to a class that knows more than just a list of strings. This holds an augmenter that can add to a SolrDocument or write to the response. Now you can make a request like:

http://localhost:8983/solr/select?q=*:*&fl=id,score,_docid_,_shard_

and get back a response:

{code}
"response":{"numFound":17,"start":0,"maxScore":1.0,"docs":[
  {
    "id":"GB18030TEST",
    "score":1.0,
    "_docid_":0,
    "_shard_":"getshardid???"},
  {
    "id":"SP2514N",
    "score":1.0,
    "_docid_":1,
    "_shard_":"getshardid???"},
  {
    "id":"6H500F0",
    "features":["SATA 3.0Gb/s, NCQ", "8.5ms seek", "16MB cache"],
    "score":1.0,
    "_docid_":2,
    "_shard_":"getshardid???"},
  ...
{code}

Right now, _docid_ just returns the Lucene docid, and _shard_ returns a constant string saying "getshardid???".

The distributed tests (BasicDistributedZkTest and TestDistributedSearch) don't pass, but everything else does. I will wait for general feedback before trying to track that down. Also, looking at SOLR-1298, I would love some feedback on how we could read function queries and ideally use ones that are already defined elsewhere.

Feedback welcome!
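Ryan's augmenter idea, reduced to its essentials (the interface and names here are hypothetical, not the patch's API, and a plain map stands in for SolrDocument):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A component that can add computed pseudo-fields to an outgoing document.
interface DocAugmenter {
    void augment(Map<String, Object> doc, int luceneDocId);
}

// Backs a "_docid_"-style pseudo-field: exposes the internal Lucene docid.
class DocIdAugmenter implements DocAugmenter {
    @Override
    public void augment(Map<String, Object> doc, int luceneDocId) {
        doc.put("_docid_", luceneDocId);
    }
}

public class AugmenterDemo {
    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "SP2514N");              // a stored field
        new DocIdAugmenter().augment(doc, 1);  // adds the computed field
        System.out.println(doc);
    }
}
```

The appeal of hanging this off the returnFields abstraction is that each response writer keeps its format-specific fast paths while the augmentation happens once, before serialization.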
Allow components to add fields to outgoing documents
----------------------------------------------------

Key: SOLR-1566
URL: https://issues.apache.org/jira/browse/SOLR-1566
Project: Solr
Issue Type: New Feature
Components: search
Reporter: Noble Paul
Assignee: Grant Ingersoll
Fix For: Next
Attachments: SOLR-1566-gsi.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566.patch, SOLR-1566.patch, SOLR-1566.patch, SOLR-1566.patch

Currently it is not possible for components to add fields to outgoing documents which are not in the stored fields of the document. This makes it cumbersome to add computed fields/metadata.
[jira] Updated: (SOLR-2411) Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it
[ https://issues.apache.org/jira/browse/SOLR-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Rowe updated SOLR-2411:
------------------------------

Attachment: SOLR-2411.patch

This patch puts the results of the {{package}}, {{package-src}} and {{generate-maven-artifacts}} targets in a new directory named {{solr/package/}}. The {{clean}} target removes this new directory, and it is added to the {{svn:ignore}} list for the {{solr/}} directory. The {{clean-dist-signatures}} target is renamed to {{clean-package-signatures}}. Like the previous patch, this patch drops creation of {{solr/dist/solr-maven.tar}} and {{solr/dist/solr.tar}} from the {{prepare-release}} target.
[jira] Commented: (SOLR-1566) Allow components to add fields to outgoing documents
[ https://issues.apache.org/jira/browse/SOLR-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004240#comment-13004240 ]

Yonik Seeley commented on SOLR-1566:
------------------------------------

Hey Ryan, I've also been working on this recently, trying to keep all of the various use-cases in mind (it's difficult!). It's all just been brainstorming on the back-end (the chain of things that can modify documents), but I also have code that parses multiple fl params, including field globbing and function queries. It doesn't *implement* those yet, just allows for their specification.
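Field globbing in fl takes very little machinery to specify. A hypothetical sketch (not Yonik's code) that turns an fl value like "id,score,feat*" into a match predicate over field names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

public class FlGlobs {
    // Compile each comma-separated fl entry; '*' is the only wildcard.
    static Predicate<String> parse(String fl) {
        List<Pattern> patterns = new ArrayList<>();
        for (String field : fl.split(",")) {
            // Quote the literal parts, splicing '.*' in for each '*'.
            String regex = Pattern.quote(field.trim()).replace("*", "\\E.*\\Q");
            patterns.add(Pattern.compile(regex));
        }
        return name -> patterns.stream().anyMatch(p -> p.matcher(name).matches());
    }

    public static void main(String[] args) {
        Predicate<String> wanted = parse("id,score,feat*");
        System.out.println(wanted.test("features"));  // matches feat*
        System.out.println(wanted.test("name"));      // matches nothing
    }
}
```

Splicing `\E.*\Q` into the `Pattern.quote` output keeps every literal character (dots, underscores) escaped while letting only the `*` act as a wildcard.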
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5711 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5711/ No tests ran. Build Log (for compile errors): [...truncated 100 lines...]
[HUDSON] Solr-3.x - Build # 284 - Failure
Build: https://hudson.apache.org/hudson/job/Solr-3.x/284/ No tests ran. Build Log (for compile errors): [...truncated 5718 lines...]
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #48: POMs out of sync
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-trunk/48/ No tests ran. Build Log (for compile errors): [...truncated 5736 lines...]
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 5696 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5696/ No tests ran. Build Log (for compile errors): [...truncated 37 lines...]
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5712 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5712/ No tests ran. Build Log (for compile errors): [...truncated 36 lines...]
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 5697 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5697/ No tests ran. Build Log (for compile errors): [...truncated 37 lines...]
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5713 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5713/ No tests ran. Build Log (for compile errors): [...truncated 36 lines...]
[HUDSON] Lucene-3.x - Build # 304 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/304/ No tests ran. Build Log (for compile errors): [...truncated 5717 lines...]
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 5698 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5698/ No tests ran. Build Log (for compile errors): [...truncated 37 lines...]
[jira] Updated: (SOLR-2408) JSONWriter NPE if data in stored Document does not match field type info in schema
[ https://issues.apache.org/jira/browse/SOLR-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-2408:
---------------------------

    Description: JSONWriter will produce a NullPointerException if an int field is stored, and then the schema.xml is updated to change the type. In general, Solr should try to generate better error messages in situations like this.

was:
SEVERE: java.lang.NullPointerException
    at org.apache.solr.request.JSONWriter.writeStr(JSONResponseWriter.java:613)
    at org.apache.solr.schema.TextField.write(TextField.java:49)
    at org.apache.solr.schema.SchemaField.write(SchemaField.java:113)
    at org.apache.solr.request.JSONWriter.writeDoc(JSONResponseWriter.java:383)
    at org.apache.solr.request.JSONWriter.writeDoc(JSONResponseWriter.java:449)
    at org.apache.solr.request.JSONWriter.writeDocList(JSONResponseWriter.java:496)
    at org.apache.solr.request.TextResponseWriter.writeVal(TextResponseWriter.java:141)
    at org.apache.solr.request.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:179)
    at org.apache.solr.request.JSONWriter.writeNamedList(JSONResponseWriter.java:294)
    at org.apache.solr.request.JSONWriter.writeResponse(JSONResponseWriter.java:92)
    at org.apache.solr.request.JSONResponseWriter.write(JSONResponseWriter.java:51)
    at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:325)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

To reproduce the exception, the following ingredients are required:
- A query that is sorted by a date field in DESC order; ASC will not crash
- A multivalued field like so: <field name="deleted_in" type="text" indexed="true" stored="true" multiValued="true"/>
- Exclude the multivalued field from either the query (AND NOT deleted_in:XXX) or with a filter query (-deleted_in:XXX)
- The value XXX of deleted_in MUST MATCH at least one value; if not, the crash will not occur.

    Summary: JSONWriter NPE if data in stored Document does not match field type info in schema (was: Solr 1.4.1 javaNullPointerException)

clarifying summary and description, original description was...
{quote} SEVERE: java.lang.NullPointerException at org.apache.solr.request.JSONWriter.writeStr(JSONResponseWriter.java:613) at org.apache.solr.schema.TextField.write(TextField.java:49) at org.apache.solr.schema.SchemaField.write(SchemaField.java:113) at org.apache.solr.request.JSONWriter.writeDoc(JSONResponseWriter.java:383) at org.apache.solr.request.JSONWriter.writeDoc(JSONResponseWriter.java:449) at org.apache.solr.request.JSONWriter.writeDocList(JSONResponseWriter.java:496) at org.apache.solr.request.TextResponseWriter.writeVal(TextResponseWriter.java:141) at org.apache.solr.request.JSONWriter.writeNamedListAsMapWithDups(JSONResponseWriter.java:179) at org.apache.solr.request.JSONWriter.writeNamedList(JSONResponseWriter.java:294) at org.apache.solr.request.JSONWriter.writeResponse(JSONResponseWriter.java:92) at org.apache.solr.request.JSONResponseWriter.write(JSONResponseWriter.java:51) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:325) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254) at
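For readers following the stack trace above: here is an illustrative sketch of the failure mode and of the friendlier error message the updated description asks for. This is not Solr's actual JSONResponseWriter code; the class and method are hypothetical, invented for illustration.

```java
// Hypothetical sketch (not Solr's code): when a stored value was written
// under a different field type, stringValue() can come back null, and
// writing it without a guard throws the NullPointerException seen in the
// trace. A guard can turn that into a diagnosable error instead.
public class NullGuardSketch {
    static String writeStr(String val) {
        if (val == null) {
            // The better error message SOLR-2408 asks for, instead of an NPE:
            throw new IllegalStateException(
                "stored value does not match schema field type");
        }
        return '"' + val + '"';
    }

    public static void main(String[] args) {
        System.out.println(writeStr("ok")); // prints "ok" with quotes
    }
}
```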
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5714 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5714/ No tests ran. Build Log (for compile errors): [...truncated 36 lines...]
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 5699 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5699/ No tests ran. Build Log (for compile errors): [...truncated 37 lines...]
[jira] Commented: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004291#comment-13004291 ]

Michael Busch commented on LUCENE-2573:
---------------------------------------

bq. we need to fix LUCENE-2881 first too.

Yeah, I haven't merged with trunk since we rolled back 2881, so we should fix it first, catch up with trunk, and then make deletes work. I might have a bit of time tonight to work on 2881.

Tiered flushing of DWPTs by RAM with low/high water marks
---------------------------------------------------------

                Key: LUCENE-2573
                URL: https://issues.apache.org/jira/browse/LUCENE-2573
            Project: Lucene - Java
         Issue Type: Improvement
         Components: Index
           Reporter: Michael Busch
           Assignee: Simon Willnauer
           Priority: Minor
            Fix For: Realtime Branch
        Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch

Now that we have DocumentsWriterPerThreads we need to track total consumed RAM across all DWPTs. A flushing strategy idea that was discussed in LUCENE-2324 was to use a tiered approach:
- Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM)
- Flush all DWPTs at a high water mark (e.g. at 110%)
- Use linear steps in between the high and low water marks: e.g. when 5 DWPTs are used, flush at 90%, 95%, 100%, 105% and 110%.

Should we allow the user to configure the low and high water mark values explicitly using total values (e.g. low water mark at 120MB, high water mark at 140MB)? Or shall we keep for simplicity the single setRAMBufferSizeMB() config method and use something like 90% and 110% for the water marks?
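The linear-step idea in the issue description can be sketched numerically: with N DWPTs and a RAM budget, compute per-DWPT flush thresholds interpolated between the low and high water marks. This is illustration only under the percentages quoted above; the class and method names are invented and this is not the LUCENE-2573 patch.

```java
// Hedged sketch of tiered flush thresholds: with dwptCount writers, the
// i-th flush trigger is linearly interpolated between the low water mark
// (e.g. 90% of the RAM budget) and the high water mark (e.g. 110%), as
// described in the issue: 5 DWPTs at 100 MB -> 90, 95, 100, 105, 110 MB.
public class TieredFlushSketch {
    static double[] thresholds(double ramBufferMB, int dwptCount,
                               double low, double high) {
        double[] t = new double[dwptCount];
        if (dwptCount == 1) {
            t[0] = ramBufferMB * low;
            return t;
        }
        double step = (high - low) / (dwptCount - 1);
        for (int i = 0; i < dwptCount; i++) {
            t[i] = ramBufferMB * (low + i * step);
        }
        return t;
    }

    public static void main(String[] args) {
        for (double t : thresholds(100.0, 5, 0.90, 1.10)) {
            System.out.println(t); // 90.0, 95.0, 100.0, 105.0, 110.0
        }
    }
}
```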
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5715 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5715/ No tests ran. Build Log (for compile errors): [...truncated 36 lines...]
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 5700 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5700/ No tests ran. Build Log (for compile errors): [...truncated 37 lines...]
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5716 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5716/

2 tests failed.

FAILED: org.apache.solr.update.DirectUpdateHandlerOptimizeTest.testWatchChildren
Error Message: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
Stack Trace:
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit.
    at java.lang.Thread.run(Thread.java:679)

FAILED: TEST-org.apache.solr.cloud.ZkSolrClientTest.xml.init
Error Message:
Stack Trace:
Test report file /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build/test-results/TEST-org.apache.solr.cloud.ZkSolrClientTest.xml was length 0

Build Log (for compile errors): [...truncated 8462 lines...]
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 5702 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5702/

1 tests failed.

REGRESSION: org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe
Error Message: Java heap space
Stack Trace:
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2894)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:589)
    at java.lang.StringBuffer.append(StringBuffer.java:337)
    at java.text.RuleBasedCollator.getCollationKey(RuleBasedCollator.java:617)
    at org.apache.lucene.collation.CollationKeyFilter.incrementToken(CollationKeyFilter.java:93)
    at org.apache.lucene.collation.CollationTestBase.assertThreadSafe(CollationTestBase.java:304)
    at org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe(TestCollationKeyAnalyzer.java:89)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1075)
    at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1007)

Build Log (for compile errors): [...truncated 4631 lines...]
[JENKINS-MAVEN] Lucene-Solr-Maven-3.x #51: POMs out of sync
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-3.x/51/ No tests ran. Build Log (for compile errors): [...truncated 28 lines...]
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #49: POMs out of sync
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-trunk/49/ No tests ran. Build Log (for compile errors): [...truncated 28 lines...]
[HUDSON] Lucene-trunk - Build # 1489 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1489/ No tests ran. Build Log (for compile errors): [...truncated 107 lines...]
[HUDSON] Solr-3.x - Build # 285 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-3.x/285/ No tests ran. Build Log (for compile errors): [...truncated 70 lines...]
[HUDSON] Solr-trunk - Build # 1434 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-trunk/1434/ No tests ran. Build Log (for compile errors): [...truncated 68 lines...]
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5717 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5717/ No tests ran. Build Log (for compile errors): [...truncated 34 lines...]
RE: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 5700 - Still Failing
Problem seems solved now; we can build the JAR files with a very old Java 1.5 on the Lucene build machine.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Uwe Schindler [mailto:u...@thetaphi.de]
Sent: Wednesday, March 09, 2011 1:32 AM
To: dev@lucene.apache.org
Subject: RE: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 5700 - Still Failing

Hi,

The Lucene FreeBSD slave was upgraded to the latest FreeBSD. Unfortunately all ports installed by me were lost after this upgrade. I fixed it by installing openjdk6 (like before), but we are still missing a working JDK 1.5 to test our compilation (this is why all builds have failed for the last few hours). For now I simply linked the directory of JDK 1.5 to openjdk6, so the builds at least work.

But very important: we cannot check if @Override on interfaces is used, or if some methods/classes unavailable in JDK 1.5 are used. So: before you commit anything to Lucene 3.x, Lucene trunk, or Solr 3.x, please run a JDK 1.5 build first. Sorry, this cannot be automated at the moment; I am working on it!

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Apache Hudson Server [mailto:hud...@hudson.apache.org]
Sent: Wednesday, March 09, 2011 1:10 AM
To: dev@lucene.apache.org
Subject: [HUDSON] Lucene-Solr-tests-only-3.x - Build # 5700 - Still Failing

Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/5700/ No tests ran. Build Log (for compile errors): [...truncated 37 lines...]
[HUDSON] Lucene-3.x - Build # 306 - Failure
Build: https://hudson.apache.org/hudson/job/Lucene-3.x/306/ No tests ran. Build Log (for compile errors): [...truncated 13829 lines...]
[HUDSON] Solr-3.x - Build # 286 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-3.x/286/ No tests ran. Build Log (for compile errors): [...truncated 11956 lines...]
[HUDSON] Solr-trunk - Build # 1435 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-trunk/1435/ No tests ran. Build Log (for compile errors): [...truncated 11518 lines...]
[HUDSON] Lucene-trunk - Build # 1490 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-trunk/1490/

No tests ran.

Build Log (for compile errors):
[...truncated 5696 lines...]
+ ANT_HOME=/home/hudson/tools/ant/latest1.7
+ SVNVERSION_EXE=svnversion
+ SVN_EXE=svn
+ CLOVER=/home/hudson/tools/clover/clover2latest
+ JAVA_HOME_15=/home/hudson/tools/java/latest1.5
+ JAVA_HOME_16=/home/hudson/tools/java/latest1.6
+ TESTS_MULTIPLIER=3
+ TEST_LINE_DOCS_FILE=/home/hudson/lucene-data/enwiki.random.lines.txt.gz
+ ROOT_DIR=checkout
+ CORE_DIR=checkout/lucene
+ MODULES_DIR=checkout/modules
+ SOLR_DIR=checkout/solr
+ ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-trunk/artifacts
+ JAVADOCS_ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-trunk/javadocs
+ set +x
Checking for files containing nocommit (exits build with failure if list is non-empty):
+ VERSION=4.0-2011-03-09_02-23-46
+ mkdir -p /home/hudson/hudson-slave/workspace/Lucene-trunk/artifacts
+ mkdir -p /home/hudson/hudson-slave/workspace/Lucene-trunk/javadocs
+ cd /home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant -Dversion=4.0-2011-03-09_02-23-46 -Dsvnversion.exe=svnversion -Dsvn.exe=svn clean package-tgz-src package-tgz
Buildfile: build.xml

clean:

init:

init-dist:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/dist
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/dist/maven

package-tgz-src:
      [tar] Building tar: /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/lucene-4.0-2011-03-09_02-23-46-src.tar
     [gzip] Building: /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/dist/lucene-4.0-2011-03-09_02-23-46-src.tar.gz
     [echo] Building checksums for '/usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/dist/lucene-4.0-2011-03-09_02-23-46-src.tar.gz'

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

init:

clover.setup:

clover.info:
     [echo]
     [echo] Clover not found. Code coverage reports disabled.
     [echo]

clover:

common.compile-core:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/classes/java
    [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/classes/java
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isn't annotated with @Deprecated
    [javac]   public boolean onOrAfter(Version other) {
    [javac]                  ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.IntValuesCreator
    [javac]         return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue );
    [javac]                                                           ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.FloatValuesCreator
    [javac]         return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue );
    [javac]                                                             ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.LongValuesCreator
    [javac]         return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue );
    [javac]                                                            ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator
    [javac]         return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue );
    [javac]                                                              ^
    [javac]
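As an aside on the "inconvertible types" errors in the build log above: the 1.5-era javac rejects a direct cast from a captured wildcard type like CachedArrayCreator<capture of ?> to a subtype parameterized with a concrete type. A minimal, hypothetical reproduction follows; the Creator/IntCreator classes are invented for illustration, and the raw-type detour shown is a common workaround, not necessarily the fix applied in Lucene.

```java
// Hedged sketch of the generics issue in the log. Casting a Creator<?> to a
// subtype of Creator<Integer> via the raw type sidesteps the parameterized
// cast that the logged javac rejected, at the cost of an unchecked warning.
class Creator<T> {}

class IntCreator extends Creator<Integer> {}

public class CaptureCastSketch {
    @SuppressWarnings({"unchecked", "rawtypes"})
    static IntCreator asIntCreator(Creator<?> c) {
        // Raw-type detour: erase the type argument first, then downcast.
        return (IntCreator) (Creator) c;
    }

    public static void main(String[] args) {
        Creator<?> c = new IntCreator();
        System.out.println(asIntCreator(c) != null); // true
    }
}
```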
[jira] Commented: (SOLR-2346) Non UTF-8 text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.
[ https://issues.apache.org/jira/browse/SOLR-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13004339#comment-13004339 ]

Koji Sekiguchi commented on SOLR-2346:
--------------------------------------

I've faced the same problem. I'm trying to index a Shift_JIS encoded text file through the following request:

http://localhost:8983/solr/update/extract?literal.id=docA001stream.file=/foo/bar/sjis.txtcommit=truestream.contentType=text%2Fplain%3B+charset%3DShift_JIS

But Tika's AutoDetectParser doesn't regard Solr's charset (or Solr doesn't set the content type on the Tika Parser; I should dig in). I looked into ExtractingDocumentLoader.java and it seemed that I could select an appropriate parser if I use the stream.type parameter:

{code:title=ExtractingDocumentLoader.java}
public void load(SolrQueryRequest req, SolrQueryResponse rsp, ContentStream stream) throws IOException {
  errHeader = "ExtractingDocumentLoader: " + stream.getSourceInfo();
  Parser parser = null;
  String streamType = req.getParams().get(ExtractingParams.STREAM_TYPE, null);
  if (streamType != null) {
    //Cache? Parsers are lightweight to construct and thread-safe, so I'm told
    MediaType mt = MediaType.parse(streamType.trim().toLowerCase());
    parser = config.getParser(mt);
  } else {
    parser = autoDetectParser;
  }
  :
}
{code}

The request was:

http://localhost:8983/solr/update/extract?literal.id=docA001stream.file=/foo/bar/sjis.txtcommit=truestream.contentType=text%2Fplain%3B+charset%3DShift_JISstream.type=text%2Fplain

I could select TXTParser rather than AutoDetectParser, but the problem wasn't solved. And I looked at the Tika Javadoc for TXTParser and it said "The text encoding of the document stream is automatically detected based on the byte patterns found at the beginning of the stream. The input metadata key HttpHeaders.CONTENT_ENCODING is used as an encoding hint if the automatic encoding detection fails.": http://tika.apache.org/0.8/api/org/apache/tika/parser/txt/TXTParser.html

So I tried to insert the following hard-coded fix:

{code:title=ExtractingDocumentLoader.java}
Metadata metadata = new Metadata();
metadata.add(ExtractingMetadataConstants.STREAM_NAME, stream.getName());
metadata.add(ExtractingMetadataConstants.STREAM_SOURCE_INFO, stream.getSourceInfo());
metadata.add(ExtractingMetadataConstants.STREAM_SIZE, String.valueOf(stream.getSize()));
metadata.add(ExtractingMetadataConstants.STREAM_CONTENT_TYPE, stream.getContentType());
metadata.add(HttpHeaders.CONTENT_ENCODING, "Shift_JIS"); // temporary fix
{code}

and the problem was gone (no more garbled characters indexed).

Non UTF-8 text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.
---------------------------------------------------------------------------------------------------------

                Key: SOLR-2346
                URL: https://issues.apache.org/jira/browse/SOLR-2346
            Project: Solr
         Issue Type: Bug
         Components: contrib - Solr Cell (Tika extraction)
   Affects Versions: 1.4.1
       Environment: Solr 1.4.1, packaged Jetty as servlet container, Windows XP SP1; machine was booted in Japanese locale.
          Reporter: Prasad Deshpande
          Priority: Critical
       Attachments: NormalSave.msg, UnicodeSave.msg, sample_jap_UTF-8.txt, sample_jap_non_UTF-8.txt

I am able to successfully index/search non-English files (like Hebrew, Japanese) which were encoded in UTF-8. However, when I tried to index data which was encoded in a local encoding like Big5 for Japanese, I could not see the desired results. The contents after indexing looked garbled for the Big5 encoded document when I searched for all indexed documents.

When I index the attached non-UTF-8 file, it indexes in the following way:

<response>
  <result name="response" numFound="1" start="0">
    <doc>
      <arr name="attr_content"><str>�� ��</str></arr>
      <arr name="attr_content_encoding"><str>Big5</str></arr>
      <arr name="attr_content_language"><str>zh</str></arr>
      <arr name="attr_language"><str>zh</str></arr>
      <arr name="attr_stream_size"><str>17</str></arr>
      <arr name="content_type"><str>text/plain</str></arr>
      <str name="id">doc2</str>
    </doc>
  </result>
</response>

Here you said it indexes the file in UTF-8; however, it seems that the non-UTF-8 file gets indexed in Big5 encoding. Here I tried fetching the indexed data stream in Big5 and converting it to UTF-8:

String id = (String) resulDocument.getFirstValue("attr_content");
byte[] bytearray = id.getBytes("Big5");
String utf8String = new String(bytearray, "UTF-8");

It does not give the expected results.

When I index the UTF-8 file, it indexes like the following:

<doc>
  <arr name="attr_content"><str>マイ ネットワーク</str></arr>
  <arr name="attr_content_encoding"><str>UTF-8</str></arr>
  <arr name="attr_stream_content_type"><str>text/plain</str></arr>
  <arr name="attr_stream_name"><str>sample_jap_unicode.txt</str></arr>
  <arr
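Related to Koji's workaround in the comment above: instead of hard-coding Shift_JIS, the charset could in principle be extracted from the stream.contentType parameter and passed to Tika as the HttpHeaders.CONTENT_ENCODING hint. The helper below is an assumption/illustration, not Solr's implementation; its name and parsing rules are invented, and a real patch would need proper MIME-parameter handling.

```java
// Hedged sketch: pull the charset parameter out of a Content-Type header
// (e.g. "text/plain; charset=Shift_JIS"). Such a value could replace the
// hard-coded "Shift_JIS" encoding hint in the temporary fix shown above.
public class CharsetHintSketch {
    static String charsetOf(String contentType) {
        if (contentType == null) return null;
        // Naive split on ';' — quoted parameter values are not handled here.
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase().startsWith("charset=")) {
                return p.substring("charset=".length()).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/plain; charset=Shift_JIS")); // Shift_JIS
    }
}
```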
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #50: POMs out of sync
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-Maven-trunk/50/

No tests ran.

Build Log (for compile errors):
[...truncated 5699 lines...]
+ JAVA_HOME_15=/home/hudson/tools/java/latest1.5
+ JAVA_HOME_16=/home/hudson/tools/java/latest1.6
+ TESTS_MULTIPLIER=3
+ TEST_LINE_DOCS_FILE=/home/hudson/lucene-data/enwiki.random.lines.txt.gz
+ ROOT_DIR=checkout
+ CORE_DIR=checkout/lucene
+ MODULES_DIR=checkout/modules
+ SOLR_DIR=checkout/solr
+ ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/artifacts
+ JAVADOCS_ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/javadocs
+ set +x
Checking for files containing nocommit (exits build with failure if list is non-empty):
+ MAVEN_ARTIFACT_VERSION=4.0-SNAPSHOT
+ SOLR_JAVA_HOME=/home/hudson/tools/java/latest1.6
+ echo ' Ignoring BasicDistributedZkTest, which always fails on Hudson trunk under Maven'
Ignoring BasicDistributedZkTest, which always fails on Hudson trunk under Maven
+ perl -pi.bak -e 's/(?=public class BasicDistributedZkTest)/import org.junit.Ignore;\n\@Ignore\n/;' checkout/solr/src/test/org/apache/solr/cloud/BasicDistributedZkTest.java
+ . /home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/nightly/common-maven.sh
+ MAVEN_ARTIFACTS=/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/maven_artifacts
+ export M2_HOME=/home/hudson/tools/maven/latest2.2
+ echo ' Removing Lucene/Solr artifacts from ~/.m2/repository/'
Removing Lucene/Solr artifacts from ~/.m2/repository/
+ rm -rf /home/hudson/.m2/repository/org/apache/lucene
+ rm -rf /home/hudson/.m2/repository/org/apache/solr
+ echo ' Done removing Lucene/Solr artifacts from ~/.m2/repository/'
Done removing Lucene/Solr artifacts from ~/.m2/repository/
+ [ -d /home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene ]
+ [ lucene = solr ]
+ ARTIFACTS_JAVA_HOME=/home/hudson/tools/java/latest1.5
+ cd /home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene
+ echo ' Generating the Maven snapshot artifacts in lucene/'
Generating the Maven snapshot artifacts in lucene/
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant -Dsvnversion.exe=svnversion -Dsvn.exe=svn -Dversion=4.0-SNAPSHOT generate-maven-artifacts
Buildfile: build.xml

maven.ant.tasks-check:

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

init:

clover.setup:

clover.info:
     [echo]
     [echo] Clover not found. Code coverage reports disabled.
     [echo]

clover:

common.compile-core:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene/build/classes/java
    [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene/build/classes/java
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isn't annotated with @Deprecated
    [javac]   public boolean onOrAfter(Version other) {
    [javac]                  ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.IntValuesCreator
    [javac]         return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue );
    [javac]                                                           ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.FloatValuesCreator
    [javac]         return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue );
    [javac]                                                             ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.LongValuesCreator
    [javac]         return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue );
    [javac]                                                            ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5718 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5718/

No tests ran.

Build Log (for compile errors):
[...truncated 54 lines...]
clean:
    [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/phonetic
    [echo] Building analyzers-smartcn...
clean:
    [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/smartcn
    [echo] Building analyzers-stempel...
clean:
    [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/analysis/build/stempel
    [echo] Building benchmark...
clean:
    [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/modules/benchmark/build
clean-contrib:
clean:
clean:
clean:
clean:
clean:
clean:
    [delete] Deleting directory /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/solr/build

BUILD SUCCESSFUL
Total time: 4 seconds
+ cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:
jflex-notice:
javacc-uptodate-check:
javacc-notice:
init:
clover.setup:
clover.info:
    [echo]
    [echo] Clover not found. Code coverage reports disabled.
    [echo]
clover:
common.compile-core:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isn't annotated with @Deprecated
    [javac]   public boolean onOrAfter(Version other) {
    [javac]   ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.IntValuesCreator
    [javac]       return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.FloatValuesCreator
    [javac]       return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.LongValuesCreator
    [javac]       return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator
    [javac]       return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:427: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ByteValuesCreator
    [javac]       return new FieldComparator.ByteComparator(numHits, (ByteValuesCreator)creator, (Byte)missingValue );
    [javac]       ^
    [javac]
[jira] Commented: (SOLR-2346) Non UTF-8 Text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.
[ https://issues.apache.org/jira/browse/SOLR-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004346#comment-13004346 ]

Koji Sekiguchi commented on SOLR-2346:
--------------------------------------

Looking at Tika, HtmlParser and TXTParser read the HttpHeaders.CONTENT_ENCODING value from the metadata. I think Solr should set it in the metadata if the charset value is provided by the user.

Non UTF-8 Text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.

Key: SOLR-2346
URL: https://issues.apache.org/jira/browse/SOLR-2346
Project: Solr
Issue Type: Bug
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.4.1
Environment: Solr 1.4.1, packaged Jetty as servlet container, Windows XP SP1, machine booted in Japanese locale.
Reporter: Prasad Deshpande
Priority: Critical
Attachments: NormalSave.msg, UnicodeSave.msg, sample_jap_UTF-8.txt, sample_jap_non_UTF-8.txt

I am able to successfully index/search non-English files (like Hebrew, Japanese) which were encoded in UTF-8. However, when I tried to index data encoded in a local encoding like Big5, I could not see the desired results. The contents looked garbled for the Big5-encoded document when I searched for all indexed documents. When I index the attached non-UTF-8 file, it indexes as follows:

<result name="response" numFound="1" start="0">
  <doc>
    <arr name="attr_content"><str>�� ��</str></arr>
    <arr name="attr_content_encoding"><str>Big5</str></arr>
    <arr name="attr_content_language"><str>zh</str></arr>
    <arr name="attr_language"><str>zh</str></arr>
    <arr name="attr_stream_size"><str>17</str></arr>
    <arr name="content_type"><str>text/plain</str></arr>
    <str name="id">doc2</str>
  </doc>
</result>

Here you said it indexes the file as UTF-8; however, it seems the non-UTF-8 file gets indexed in Big5 encoding. Here I tried fetching the indexed data stream as Big5 and converting it to UTF-8:

String id = (String) resulDocument.getFirstValue("attr_content");
byte[] bytearray = id.getBytes("Big5");
String utf8String = new String(bytearray, "UTF-8");

It does not give the expected results.

When I index the UTF-8 file, it indexes like the following:

<doc>
  <arr name="attr_content"><str>マイ ネットワーク</str></arr>
  <arr name="attr_content_encoding"><str>UTF-8</str></arr>
  <arr name="attr_stream_content_type"><str>text/plain</str></arr>
  <arr name="attr_stream_name"><str>sample_jap_unicode.txt</str></arr>
  <arr name="attr_stream_size"><str>28</str></arr>
  <arr name="attr_stream_source_info"><str>myfile</str></arr>
  <arr name="content_type"><str>text/plain</str></arr>
  <str name="id">doc2</str>
</doc>

So, I can index and search UTF-8 data. For more reference, below is the discussion with Yonik. Please find attached the TXT file I was using to index and search.

curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&fmap.div=foo_t&boost.foo_t=3&commit=true&charset=utf-8" -F myfile=@sample_jap_non_UTF-8

One problem is that you are giving big5 encoded text to Solr and saying that it's UTF8. Here's one way to actually tell Solr what the encoding of the text you are sending is:

curl "http://localhost:8983/solr/update/extract?literal.id=doc1&uprefix=attr_&fmap.content=attr_content&fmap.div=foo_t&boost.foo_t=3&commit=true" --data-binary @sample_jap_non_UTF-8.txt -H 'Content-type:text/plain; charset=big5'

Now the problem appears that for some reason this doesn't work... Could you open a JIRA issue and attach your two test files?
-Yonik
http://lucidimagination.com

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
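The getBytes/new String round-trip quoted above cannot work: once Big5 bytes have been decoded as UTF-8, the undecodable bytes are already replaced with U+FFFD and the original text is gone. The fix is to decode the raw bytes with the correct charset in the first place. A minimal JDK-only sketch of the difference (using a Chinese character that has a Big5 encoding; the class name is mine, not from the issue):

```java
import java.nio.charset.Charset;

public class CharsetDemo {
    public static void main(String[] args) {
        Charset big5 = Charset.forName("Big5");
        Charset utf8 = Charset.forName("UTF-8");

        // Encode a character that exists in Big5 into Big5 bytes.
        byte[] big5Bytes = "中".getBytes(big5);

        // Decoding with the right charset recovers the text...
        String ok = new String(big5Bytes, big5);
        // ...but decoding the same bytes as UTF-8 yields U+FFFD replacement
        // characters - the "��" visible in the indexed document above.
        String garbled = new String(big5Bytes, utf8);

        System.out.println(ok.equals("中"));       // true
        System.out.println(garbled.equals("中"));  // false: data was mis-decoded
    }
}
```

This is why Yonik's second curl command matters: the Content-type header's charset parameter is the only way the server can know which decoding to apply to the raw bytes.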
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5719 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5719/

No tests ran.

Build Log (for compile errors):
[...truncated 43 lines...]
clean:
    [echo] Building analyzers-icu...
clean:
    [echo] Building analyzers-phonetic...
clean:
    [echo] Building analyzers-smartcn...
clean:
    [echo] Building analyzers-stempel...
clean:
    [echo] Building benchmark...
clean:
clean-contrib:
clean:
clean:
clean:
clean:
clean:
clean:

BUILD SUCCESSFUL
Total time: 6 seconds
+ cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:
jflex-notice:
javacc-uptodate-check:
javacc-notice:
init:
clover.setup:
clover.info:
    [echo]
    [echo] Clover not found. Code coverage reports disabled.
    [echo]
clover:
common.compile-core:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isn't annotated with @Deprecated
    [javac]   public boolean onOrAfter(Version other) {
    [javac]   ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.IntValuesCreator
    [javac]       return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.FloatValuesCreator
    [javac]       return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.LongValuesCreator
    [javac]       return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator
    [javac]       return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:427: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ByteValuesCreator
    [javac]       return new FieldComparator.ByteComparator(numHits, (ByteValuesCreator)creator, (Byte)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:430: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ShortValuesCreator
    [javac]       return new FieldComparator.ShortComparator(numHits, (ShortValuesCreator)creator, (Short)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann]
[jira] Commented: (SOLR-1566) Allow components to add fields to outgoing documents
[ https://issues.apache.org/jira/browse/SOLR-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004354#comment-13004354 ]

Bill Bell commented on SOLR-1566:
---------------------------------

Here is one use case: return geodist() as a field in the results.

Allow components to add fields to outgoing documents

Key: SOLR-1566
URL: https://issues.apache.org/jira/browse/SOLR-1566
Project: Solr
Issue Type: New Feature
Components: search
Reporter: Noble Paul
Assignee: Grant Ingersoll
Fix For: Next
Attachments: SOLR-1566-gsi.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566-rm.patch, SOLR-1566.patch, SOLR-1566.patch, SOLR-1566.patch, SOLR-1566.patch

Currently it is not possible for components to add fields to outgoing documents which are not in the stored fields of the document. This makes it cumbersome to add computed fields/metadata.
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5720 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5720/

No tests ran.

Build Log (for compile errors):
[...truncated 43 lines...]
clean:
    [echo] Building analyzers-icu...
clean:
    [echo] Building analyzers-phonetic...
clean:
    [echo] Building analyzers-smartcn...
clean:
    [echo] Building analyzers-stempel...
clean:
    [echo] Building benchmark...
clean:
clean-contrib:
clean:
clean:
clean:
clean:
clean:
clean:

BUILD SUCCESSFUL
Total time: 2 seconds
+ cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:
jflex-notice:
javacc-uptodate-check:
javacc-notice:
init:
clover.setup:
clover.info:
    [echo]
    [echo] Clover not found. Code coverage reports disabled.
    [echo]
clover:
common.compile-core:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isn't annotated with @Deprecated
    [javac]   public boolean onOrAfter(Version other) {
    [javac]   ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.IntValuesCreator
    [javac]       return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.FloatValuesCreator
    [javac]       return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.LongValuesCreator
    [javac]       return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator
    [javac]       return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:427: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ByteValuesCreator
    [javac]       return new FieldComparator.ByteComparator(numHits, (ByteValuesCreator)creator, (Byte)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:430: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ShortValuesCreator
    [javac]       return new FieldComparator.ShortComparator(numHits, (ShortValuesCreator)creator, (Short)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann]
[jira] Updated: (SOLR-1752) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)
[ https://issues.apache.org/jira/browse/SOLR-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Mattozzi updated SOLR-1752:
--------------------------------

Attachment: SOLR-1752.patch

I ran into this problem today and was surprised it hadn't been fixed. I've attached a patch to UpdateRequest that maintains an ordered list that can be a mix of SolrInputDocuments to add, ids to delete, and delete queries. There are a few places where my patch iterates over documents instead of doing an addAll, so there may be some inefficiencies. It seems like these would be outweighed by the ability to group update operations, but I could always optimize further.

SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)

Key: SOLR-1752
URL: https://issues.apache.org/jira/browse/SOLR-1752
Project: Solr
Issue Type: Bug
Components: clients - java, update
Affects Versions: 1.4
Reporter: Jayson Minard
Assignee: Shalin Shekhar Mangar
Priority: Blocker
Attachments: SOLR-1752.patch

Add this test to SolrExampleTests.java and it will fail when using the XML Request Writer (now the default), but not if you change SolrExampleJettyTest to use the BinaryRequestWriter.

{code}
public void testAddDeleteInSameRequest() throws Exception {
  SolrServer server = getSolrServer();

  SolrInputDocument doc3 = new SolrInputDocument();
  doc3.addField( "id", "id3", 1.0f );
  doc3.addField( "name", "doc3", 1.0f );
  doc3.addField( "price", 10 );

  UpdateRequest up = new UpdateRequest();
  up.add( doc3 );
  up.deleteById( "id001" );
  up.setWaitFlush( false );
  up.setWaitSearcher( false );
  up.process( server );
}
{code}

It terminates with this exception:

{code}
Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots (start tag in epilog?).
 at [row,col {unknown-source}]: [1,125]
	at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
	at org.mortbay.jetty.Server.handle(Server.java:285)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
	at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
 at [row,col {unknown-source}]: [1,125]
	at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
	at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
	at com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
	at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
	at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)
	at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
	at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
	at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
	... 18 more
{code}
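The "Illegal to have multiple roots" failure above is an XML well-formedness issue: an XML document may contain only one root element, and (per the stack trace) the XML request writer apparently produced an <add> element followed by a sibling <delete> element in one request body, which the Woodstox StAX parser on the server side rejects. A small sketch of the failure mode using the JDK's built-in StAX parser on a toy payload (the helper name is mine; this is not Solr code):

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class MultipleRoots {
    // Returns true if the payload parses as a single well-formed XML document.
    static boolean wellFormed(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                r.next();   // throws when a second root element is encountered
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A single root element is fine.
        System.out.println(wellFormed("<add><doc/></add>"));          // true
        // Two sibling roots, as the buggy request body produced: rejected.
        System.out.println(wellFormed("<add><doc/></add><delete/>")); // false
    }
}
```

This also explains why the binary request writer is unaffected: its wire format is not XML, so there is no single-root constraint to violate.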
[jira] Assigned: (SOLR-2346) Non UTF-8 Text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.
[ https://issues.apache.org/jira/browse/SOLR-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi reassigned SOLR-2346:
------------------------------------

Assignee: Koji Sekiguchi

Non UTF-8 Text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.

Key: SOLR-2346
URL: https://issues.apache.org/jira/browse/SOLR-2346
Project: Solr
Issue Type: Bug
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.4.1
Environment: Solr 1.4.1, packaged Jetty as servlet container, Windows XP SP1, machine booted in Japanese locale.
Reporter: Prasad Deshpande
Assignee: Koji Sekiguchi
Priority: Critical
Attachments: NormalSave.msg, SOLR-2346.patch, UnicodeSave.msg, sample_jap_UTF-8.txt, sample_jap_non_UTF-8.txt
[jira] Updated: (SOLR-2346) Non UTF-8 Text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.
[ https://issues.apache.org/jira/browse/SOLR-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi updated SOLR-2346:
---------------------------------

Affects Version/s: 4.0
                   3.1
Fix Version/s: 4.0
               3.2

Non UTF-8 Text files having other than English texts (Japanese/Hebrew) are not getting indexed correctly.

Key: SOLR-2346
URL: https://issues.apache.org/jira/browse/SOLR-2346
Project: Solr
Issue Type: Bug
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 1.4.1, 3.1, 4.0
Environment: Solr 1.4.1, packaged Jetty as servlet container, Windows XP SP1, machine booted in Japanese locale.
Reporter: Prasad Deshpande
Assignee: Koji Sekiguchi
Priority: Critical
Fix For: 3.2, 4.0
Attachments: NormalSave.msg, SOLR-2346.patch, UnicodeSave.msg, sample_jap_UTF-8.txt, sample_jap_non_UTF-8.txt
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5721 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5721/

No tests ran.

Build Log (for compile errors):
[...truncated 43 lines...]
clean:
    [echo] Building analyzers-icu...
clean:
    [echo] Building analyzers-phonetic...
clean:
    [echo] Building analyzers-smartcn...
clean:
    [echo] Building analyzers-stempel...
clean:
    [echo] Building benchmark...
clean:
clean-contrib:
clean:
clean:
clean:
clean:
clean:
clean:

BUILD SUCCESSFUL
Total time: 2 seconds
+ cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml

jflex-uptodate-check:
jflex-notice:
javacc-uptodate-check:
javacc-notice:
init:
clover.setup:
clover.info:
    [echo]
    [echo] Clover not found. Code coverage reports disabled.
    [echo]
clover:
common.compile-core:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isn't annotated with @Deprecated
    [javac]   public boolean onOrAfter(Version other) {
    [javac]   ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.IntValuesCreator
    [javac]       return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.FloatValuesCreator
    [javac]       return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.LongValuesCreator
    [javac]       return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator
    [javac]       return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:427: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ByteValuesCreator
    [javac]       return new FieldComparator.ByteComparator(numHits, (ByteValuesCreator)creator, (Byte)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:430: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ShortValuesCreator
    [javac]       return new FieldComparator.ShortComparator(numHits, (ShortValuesCreator)creator, (Short)missingValue );
    [javac]       ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann]
[jira] Updated: (SOLR-1752) SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)
[ https://issues.apache.org/jira/browse/SOLR-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mike Mattozzi updated SOLR-1752:

Attachment: SOLR-1752_2.patch

Here's a second version of my previous patch that keeps Collections of documents together instead of wrapping each one in an add-document operation. Should save a bit on object creation and iteration compared to the previous patch I attached.

SolrJ fails with exception when passing document ADD and DELETEs in the same request using XML request writer (but not binary request writer)
-

Key: SOLR-1752
URL: https://issues.apache.org/jira/browse/SOLR-1752
Project: Solr
Issue Type: Bug
Components: clients - java, update
Affects Versions: 1.4
Reporter: Jayson Minard
Assignee: Shalin Shekhar Mangar
Priority: Blocker
Attachments: SOLR-1752.patch, SOLR-1752_2.patch

Add this test to SolrExampleTests.java and it will fail when using the XML request writer (now the default), but not if you change SolrExampleJettyTest to use the BinaryRequestWriter.

{code}
public void testAddDeleteInSameRequest() throws Exception {
  SolrServer server = getSolrServer();
  SolrInputDocument doc3 = new SolrInputDocument();
  doc3.addField( "id", "id3", 1.0f );
  doc3.addField( "name", "doc3", 1.0f );
  doc3.addField( "price", 10 );
  UpdateRequest up = new UpdateRequest();
  up.add( doc3 );
  up.deleteById( "id001" );
  up.setWaitFlush( false );
  up.setWaitSearcher( false );
  up.process( server );
}
{code}

terminates with exception:

{code}
Feb 3, 2010 8:55:34 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Illegal to have multiple roots (start tag in epilog?).
 at [row,col {unknown-source}]: [1,125]
	at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:72)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
	at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
	at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
	at org.mortbay.jetty.Server.handle(Server.java:285)
	at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
	at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
	at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
	at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202)
	at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
	at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
	at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal to have multiple roots (start tag in epilog?).
 at [row,col {unknown-source}]: [1,125]
	at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:630)
	at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:461)
	at com.ctc.wstx.sr.BasicStreamReader.handleExtraRoot(BasicStreamReader.java:2155)
	at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2070)
	at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2647)
	at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
	at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:90)
	at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
	... 18 more
{code}

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
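The root cause is visible without Solr at all: an XML parser accepts exactly one document root, so a request body that concatenates an `<add>` element and a `<delete>` element as siblings is malformed, while wrapping both operations under a single root parses fine. A minimal JDK-only sketch (the `<update>` wrapper element is illustrative, not necessarily the exact element the patch emits):

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;

public class MultipleRootsDemo {
    // Returns true if the given string is a well-formed XML document.
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            return true;
        } catch (Exception e) {
            // SAXParseException for malformed input, e.g. a second root.
            return false;
        }
    }

    public static void main(String[] args) {
        // Two sibling roots, as produced by naively concatenating an add
        // message and a delete message into one request body.
        String twoRoots = "<add><doc/></add><delete><id>id001</id></delete>";
        // The same operations under one (illustrative) wrapper root.
        String wrapped =
            "<update><add><doc/></add><delete><id>id001</id></delete></update>";
        System.out.println("twoRoots=" + parses(twoRoots));
        System.out.println("wrapped=" + parses(wrapped));
    }
}
```
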
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5722 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5722/

No tests ran.

Build Log (for compile errors):
[...truncated 43 lines...]
clean:
     [echo] Building analyzers-icu...
clean:
     [echo] Building analyzers-phonetic...
clean:
     [echo] Building analyzers-smartcn...
clean:
     [echo] Building analyzers-stempel...
clean:
     [echo] Building benchmark...
clean:
clean-contrib:
clean:
clean:
clean:
clean:
clean:
clean:
BUILD SUCCESSFUL
Total time: 1 second
+ cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene
+ JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
Buildfile: build.xml
jflex-uptodate-check:
jflex-notice:
javacc-uptodate-check:
javacc-notice:
init:
clover.setup:
clover.info:
     [echo]
     [echo] Clover not found. Code coverage reports disabled.
     [echo]
clover:
common.compile-core:
    [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isnt annotated with @Deprecated
    [javac]   public boolean onOrAfter(Version other) {
    [javac]                  ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.IntValuesCreator
    [javac]     return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue );
    [javac]                ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.FloatValuesCreator
    [javac]     return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue );
    [javac]                ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.LongValuesCreator
    [javac]     return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue );
    [javac]                ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator
    [javac]     return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue );
    [javac]                ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:427: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ByteValuesCreator
    [javac]     return new FieldComparator.ByteComparator(numHits, (ByteValuesCreator)creator, (Byte)missingValue );
    [javac]                ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:430: inconvertible types
    [javac] found   : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?>
    [javac] required: org.apache.lucene.search.cache.ShortValuesCreator
    [javac]     return new FieldComparator.ShortComparator(numHits, (ShortValuesCreator)creator, (Short)missingValue );
    [javac]                ^
    [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann]
[HUDSON] Solr-3.x - Build # 287 - Still Failing
Build: https://hudson.apache.org/hudson/job/Solr-3.x/287/

No tests ran.

Build Log (for compile errors):
[...truncated 11956 lines...]
[jira] Commented: (SOLR-2337) Solr needs hits= added to the log when using grouping
[ https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004379#comment-13004379 ] Bill Bell commented on SOLR-2337:

I found a quick fix that will output the number of hits. See the attached patch.

Solr needs hits= added to the log when using grouping
-

Key: SOLR-2337
URL: https://issues.apache.org/jira/browse/SOLR-2337
Project: Solr
Issue Type: Bug
Components: SearchComponents - other
Affects Versions: 4.0
Reporter: Bill Bell
Attachments: SOLR.2377.patch

We monitor the Solr logs to review queries that have hits=0. These are easy to find and review, which helps us improve relevancy. When using group=true, hits= does not show up:

{code}
2011-01-27 01:10:16,117 INFO core.SolrCore - [collection1] webapp= path=/select params={group=true&group.field=gender&group.field=id&q=*:*} status=0 QTime=15
{code}

The code in QueryComponent.java needs to call matches() after calling grouping.execute() and add up the total. hits= does appear in the log for mainResult, but not for standard grouping. This should be easy to add since the matches are already available...
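The fix the issue describes - summing the per-command match counts after grouping.execute() and logging the total as hits= - can be sketched in isolation. The types below are hypothetical stand-ins, not Solr's actual QueryComponent/Grouping classes:

```java
import java.util.Arrays;
import java.util.List;

public class GroupedHitsLogSketch {
    // Hypothetical stand-in for one grouping command's result; in Solr the
    // count would come from each command's matches() after grouping.execute().
    static class GroupCommandResult {
        final String field;
        final int matches;
        GroupCommandResult(String field, int matches) {
            this.field = field;
            this.matches = matches;
        }
    }

    // Add up the matches across all grouping commands, as the issue suggests.
    static long totalHits(List<GroupCommandResult> commands) {
        long total = 0;
        for (GroupCommandResult c : commands) {
            total += c.matches;
        }
        return total;
    }

    public static void main(String[] args) {
        // e.g. group.field=gender and group.field=id over the same result set
        List<GroupCommandResult> commands = Arrays.asList(
                new GroupCommandResult("gender", 42),
                new GroupCommandResult("id", 42));
        System.out.println("hits=" + totalHits(commands));
    }
}
```
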
[jira] Resolved: (SOLR-2337) Solr needs hits= added to the log when using grouping
[ https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell resolved SOLR-2337.

Resolution: Fixed
Fix Version/s: 4.0

Ready for committer
[jira] Reopened: (SOLR-2337) Solr needs hits= added to the log when using grouping
[ https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell reopened SOLR-2337:
IndexWriter#setRAMBufferSizeMB removed in trunk
Heya,

I think the ability to change the RAMBufferSizeMB on a live IndexWriter (without the need to close and reopen it) is an important one, and it seems it's deprecated in 3.1 and removed on trunk. Is there a chance to get it back?

-shay.banon
[jira] Updated: (SOLR-2337) Solr needs hits= added to the log when using grouping
[ https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2337:

Attachment: (was: SOLR.2377.patch)
[jira] Updated: (SOLR-2337) Solr needs hits= added to the log when using grouping
[ https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2337:

Attachment: SOLR.2337.patch
[jira] Updated: (SOLR-2337) Solr needs hits= added to the log when using grouping
[ https://issues.apache.org/jira/browse/SOLR-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bill Bell updated SOLR-2337:

Comment: was deleted (was: Ready for committer)
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5724 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5724/

No tests ran.

Build Log (for compile errors):
[...truncated 43 lines...]
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5725 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5725/

No tests ran.

Build Log (for compile errors):
[...truncated 43 lines...]
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5726 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5726/

No tests ran.

Build Log (for compile errors):
[...truncated 43 lines...]
Re: GSoC
Hey David and all others who want to contribute to GSoC, the ASF has applied for GSoC 2011 as a mentoring organization. As a ASF project we don't need to apply directly though but we need to register our ideas now. This works like almost anything in the ASF through JIRA. All ideas should be recorded as JIRA tickets labeled with gsoc2011. Once this is done it will show up here: http://s.apache.org/gsoc2011tasks Everybody who is interested in GSoC as a mentor or student should now read this too http://community.apache.org/gsoc.html Thanks, Simon On Thu, Feb 24, 2011 at 12:14 PM, David Nemeskey nemeskey.da...@sztaki.hu wrote: Please find the implementation plan attached. The word soon gets a new meaning when power outages are taken into account. :) As before, comments are welcome. David On Tuesday, February 22, 2011 15:22:57 Simon Willnauer wrote: I think that is good for now. I should get started on codeawards and wrap up our proposals. I hope I can do that this week. simon On Tue, Feb 22, 2011 at 3:16 PM, David Nemeskey nemeskey.da...@sztaki.hu wrote: Hey, I have written the proposal. Please let me know if you want more / less of certain parts. Should I upload it somewhere? Implementation plan soon to follow. Sorry for the late reply; I have been rather busy these past few weeks. David On Wednesday, February 02, 2011 10:35:55 Simon Willnauer wrote: Hey David, I saw that you added a tiny line to the GSoC Lucene wiki - thanks for that. On Wed, Feb 2, 2011 at 10:10 AM, David Nemeskey nemeskey.da...@sztaki.hu wrote: Hi guys, Mark, Robert, Simon: thanks for the support! I really hope we can work together this summer (and before that, obviously). Same here! According to http://www.google- melange.com/document/show/gsoc_program/google/gsoc2011/timeline , there's still some time until the application period. So let me use this week to finish my PhD research plan, and get back to you next week. I am not really familiar with how the program works, i.e. 
how detailed the application description should be, when mentorship is decided, etc. so I guess we will have a lot to talk about. :) so from a 1ft view it works like this: 1. Write up a short proposal of what your idea is about 2. make it public! and publish an implementation plan - how you would want to realize your proposal. If you don't follow that 100% in the actual impl. don't worry. It's just meant to give us an idea that you know what you are doing and where you want to go. Something like a 1 A4 rough design doc. 3. give other people the chance to apply for the same suggestion (this is how it works though) 4. Let the ASF / us assign one or more possible mentors to it 5. let us apply for a slot in GSoC (those are limited for organizations) 6. get accepted 7. rock it! (Actually, should we move this discussion private?) no - we usually do everything in public except for discussions within the PMC that are meant to be private for legal reasons or similar things. Let's stick to the mailing list for all communication unless you have something that should clearly not be public. This also gives other contributors a chance to help and get interested in your work!! simon David Hi David, honestly this sounds fantastic. It would be great to have someone to work with us on this issue! To date, progress is pretty slow-going (minor improvements, cleanups, additional stats here and there)... but we really need all the help we can get, especially from people who have a really good understanding of the various models. 
In case you are interested, here are some references to discussions about adding more flexibility (with some prototypes etc): http://www.lucidimagination.com/search/document/72787e0e54f798e4/baby_steps_towards_making_lucene_s_scoring_more_flexible https://issues.apache.org/jira/browse/LUCENE-2392 On Fri, Jan 28, 2011 at 11:32 AM, David Nemeskey nemeskey.da...@sztaki.hu wrote: Hi all, I have already sent this mail to Simon Willnauer, and he suggested that I post it here for discussion. I am David Nemeskey, a PhD student at the Eotvos Lorand University, Budapest, Hungary. I am doing IR-related research, and we have considered using Lucene as our search engine. We were quite satisfied with the speed and ease of use. However, we would like to experiment with different ranking algorithms, and this is where problems arise. Lucene only supports the VSM, and unfortunately the ranking architecture seems to be tailored specifically to its needs. I would be very much interested in revamping the ranking component as a GSoC project. The following modifications should be doable in the allocated time frame: - a new ranking class hierarchy, which is generic enough to allow easy
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5727 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5727/ No tests ran. Build Log (for compile errors): [...truncated 43 lines...] clean: [echo] Building analyzers-icu... clean: [echo] Building analyzers-phonetic... clean: [echo] Building analyzers-smartcn... clean: [echo] Building analyzers-stempel... clean: [echo] Building benchmark... clean: clean-contrib: clean: clean: clean: clean: clean: clean: BUILD SUCCESSFUL Total time: 1 second + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene + JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib Buildfile: build.xml jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public boolean onOrAfter(Version other) { [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> 
[javac] required: org.apache.lucene.search.cache.IntValuesCreator [javac] return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.FloatValuesCreator [javac] return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.LongValuesCreator [javac] return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator [javac] return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:427: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> 
[javac] required: org.apache.lucene.search.cache.ByteValuesCreator [javac] return new FieldComparator.ByteComparator(numHits, (ByteValuesCreator)creator, (Byte)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:430: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.ShortValuesCreator [javac] return new FieldComparator.ShortComparator(numHits, (ShortValuesCreator)creator, (Short)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann]
[HUDSON] Lucene-Solr-tests-only-trunk - Build # 5728 - Still Failing
Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5728/ No tests ran. Build Log (for compile errors): [...truncated 43 lines...] clean: [echo] Building analyzers-icu... clean: [echo] Building analyzers-phonetic... clean: [echo] Building analyzers-smartcn... clean: [echo] Building analyzers-stempel... clean: [echo] Building benchmark... clean: clean-contrib: clean: clean: clean: clean: clean: clean: BUILD SUCCESSFUL Total time: 1 second + cd /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene + JAVA_HOME=/home/hudson/tools/java/latest1.5 /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib Buildfile: build.xml jflex-uptodate-check: jflex-notice: javacc-uptodate-check: javacc-notice: init: clover.setup: clover.info: [echo] [echo] Clover not found. Code coverage reports disabled. [echo] clover: common.compile-core: [mkdir] Created dir: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] Compiling 508 source files to /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/build/classes/java [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80: warning: [dep-ann] deprecated name isnt annotated with @Deprecated [javac] public boolean onOrAfter(Version other) { [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:415: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> 
[javac] required: org.apache.lucene.search.cache.IntValuesCreator [javac] return new FieldComparator.IntComparator(numHits, (IntValuesCreator)creator, (Integer)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:418: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.FloatValuesCreator [javac] return new FieldComparator.FloatComparator(numHits, (FloatValuesCreator)creator, (Float)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:421: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.LongValuesCreator [javac] return new FieldComparator.LongComparator(numHits, (LongValuesCreator)creator, (Long)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:424: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.DoubleValuesCreator [javac] return new FieldComparator.DoubleComparator(numHits, (DoubleValuesCreator)creator, (Double)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:427: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> 
[javac] required: org.apache.lucene.search.cache.ByteValuesCreator [javac] return new FieldComparator.ByteComparator(numHits, (ByteValuesCreator)creator, (Byte)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/search/SortField.java:430: inconvertible types [javac] found : org.apache.lucene.search.cache.CachedArrayCreator<capture of ?> [javac] required: org.apache.lucene.search.cache.ShortValuesCreator [javac] return new FieldComparator.ShortComparator(numHits, (ShortValuesCreator)creator, (Short)missingValue ); [javac] ^ [javac] /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-trunk/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34: warning: [dep-ann]
[jira] Commented: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003852#comment-13003852 ] Dawid Weiss commented on LUCENE-2953: - There seems to be no consensus on how to deal with generic arrays. Even the JDK has two different implementations -- one in ArrayDeque (uses T[]), the other in ArrayList (uses Object[]). Creating an array of a given component type is (can be?) more costly than keeping an Object[] array because it needs to be done via a call to Array.newInstance (haven't checked though). Theoretically having a concrete-type array should speed up iterators (because no additional casts are needed), but I don't think this is the case. In fact, I just wrote a simple Caliper benchmark that compares these (attached); my results show the runtimes are nearly identical (probably within stddev): {noformat} 0% Scenario{vm=java, trial=0, benchmark=Generic, size=100} 8985430.93 ns; σ=257329.28 ns @ 10 trials 33% Scenario{vm=java, trial=0, benchmark=GenericSubclass, size=100} 8989486.27 ns; σ=207151.20 ns @ 10 trials 67% Scenario{vm=java, trial=0, benchmark=Object, size=100} 8767324.34 ns; σ=218235.97 ns @ 10 trials benchmark ms linear runtime Generic 8.99 = GenericSubclass 8.99 == Object 8.77 = vm: java trial: 0 size: 100 {noformat} PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object Key: LUCENE-2953 URL: https://issues.apache.org/jira/browse/LUCENE-2953 Project: Lucene - Java Issue Type: Bug Reporter: Hoss Man Attachments: LUCENE-2953.patch as discovered in SOLR-2410 the fact that the protected heap variable in PriorityQueue is initialized using an Object[] makes it impossible for subclasses of PriorityQueue to exist and access the heap array unless they bind the generic to Object. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
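The two JDK-style layouts Dawid contrasts can be sketched side by side. This is an illustrative comparison under assumed names (`CastBacked`, `ReflectBacked`), not code from the Lucene or JDK sources: one class stores a plain `Object[]` and casts on read (the ArrayList style), the other obtains an array whose runtime component type really is `T` via `Array.newInstance` (the reflective call alluded to above).

```java
import java.lang.reflect.Array;

// ArrayList-style backing: runtime type is plain Object[], cast on read.
class CastBacked<T> {
    private final Object[] items;

    CastBacked(int size) { items = new Object[size]; }

    void set(int i, T v) { items[i] = v; }

    @SuppressWarnings("unchecked")
    T get(int i) { return (T) items[i]; }  // one unchecked cast per read
}

// Reflective backing: the array's runtime component type really is T.
class ReflectBacked<T> {
    private final T[] items;

    @SuppressWarnings("unchecked")
    ReflectBacked(Class<T> type, int size) {
        // reflective allocation is the cost Dawid mentions
        items = (T[]) Array.newInstance(type, size);
    }

    void set(int i, T v) { items[i] = v; }

    T get(int i) { return items[i]; }  // no cast needed

    T[] raw() { return items; }        // safe to expose: it is a real T[]
}
```

Either layout works internally; the benchmark above suggests the per-read cast costs essentially nothing, which is why reverting to `Object[]` is attractive.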
[jira] Updated: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated LUCENE-2953: Attachment: BenchmarkArrayAccess.java A Google Caliper benchmark comparing iteration over a class with a generic T[] (real T[] type), its concrete-type subclass and a class using Object[] and (T) casts for accessing array elements.
[jira] Commented: (LUCENE-2948) Make var gap terms index a partial prefix trie
[ https://issues.apache.org/jira/browse/LUCENE-2948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003858#comment-13003858 ] Dawid Weiss commented on LUCENE-2948: - Nice visualization! Phew! So this means for any given suffix you'd know how many terms end with it? But how would we then use this info to better pick index nodes? Such an inverted array would allow you to precalculate which FSA states have a dense subtree and which don't and freeze them even before you'd start constructing the full FSA. But I agree that even without trying this seems like a costly solution. With respect to the speed differences table above -- I believe there is going to be no perfect method for all kinds of queries, so it's a tradeoff. What one could do is try to determine the distribution (bushiness factor of trienfication? :) of input terms and then build separate automata for specific kinds of queries... But then they'd consume more RAM, so another tradeoff kicks in. Make var gap terms index a partial prefix trie -- Key: LUCENE-2948 URL: https://issues.apache.org/jira/browse/LUCENE-2948 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-2948.patch, LUCENE-2948.patch, LUCENE-2948.patch, LUCENE-2948_automaton.patch, Results.png Var gap stores (in an FST) the indexed terms (every 32nd term, by default), minus their non-distinguishing suffixes. However, often times the resulting FST is close to a prefix trie in some portion of the terms space. 
By allowing some nodes of the FST to store all outgoing edges, including ones that do not lead to an indexed term, and by recording that this node is then authoritative as to what terms exist in the terms dict from that prefix, we can get some important benefits: * It becomes possible to know that a certain term prefix cannot exist in the terms index, which means we can save a disk seek in some cases (like PK lookup, docFreq, etc.) * We can query for the next possible prefix in the index, allowing some MTQs (eg FuzzyQuery) to save disk seeks. Basically, the terms index is able to answer questions that previously required seeking/scanning in the terms dict file.
Re: [VOTE] Lucene and Solr 3.1 release candidate
I found what seems to be a glitch in StopFilter's ctors -- the boolean 'enablePosInc' was removed from the ctors and users now have to use the setter instead. However, the ctors do default to 'true' if the passed-in Version is onOrAfter(29). All of FilteringTokenFilter's sub-classes include the enablePosIncr in their ctors, including FilteringTF itself. Therefore I assume the parameter was mistakenly dropped from StopFilter's ctors. Also, the @deprecated text doesn't mention how I should enable/disable it, and reading the source code doesn't help either, since the setter/getter are in FilteringTF. Also, LengthFilter has a deprecated ctor, but the class was added on Nov 16 and I don't see it in 3.0.3. So perhaps we can remove that ctor (and add a @since tag to the class)? I don't know if these two warrant a new RC but I think they are important to fix. Shai On Mon, Mar 7, 2011 at 5:52 PM, Smiley, David W. dsmi...@mitre.org wrote: So https://issues.apache.org/jira/browse/SOLR-2405 didn't make it in yesterday (apparently it didn't)? :-( Darn... maybe I shouldn't have waited for a committer to agree with the issue. I would have had it in Saturday. ~ David Smiley On Mar 7, 2011, at 1:32 AM, Robert Muir wrote: Hi all, I have posted a release candidate for both Lucene 3.1 and Solr 3.1, both from revision 1078688 of http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/ Thanks for all your help! Please test them and give your votes, the tentative release date for both versions is Sunday, March 13th, 2011. Only votes from Lucene PMC are binding, but everyone is welcome to check the release candidates and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast. The release candidates are produced in parallel because in 2010 we merged the development of Lucene and Solr in order to produce higher quality releases. 
While we voted to reserve the right to release Lucene by itself, in my opinion we should definitely try to avoid this unless absolutely necessary, as it would ultimately cause more work and complication: instead it would be far easier to just fix whatever issues are discovered and respin both releases again. Because of this, I ask that you cast a single vote to cover both releases. If the vote succeeds, both sets of artifacts can go their separate ways to the different websites. Artifacts are located here: http://s.apache.org/solrcene31rc0
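The Version-gated default Shai describes for StopFilter can be illustrated with a dependency-free sketch. The class and enum below are stand-ins, not Lucene's actual StopFilter or Version types: the point is only the pattern, where the ctor no longer takes the boolean but derives its default from the match version, leaving the setter for callers who need the other behavior.

```java
// Hypothetical stand-in for the Version-gated default pattern discussed above.
class VersionGatedFilter {
    enum Version {
        LUCENE_24, LUCENE_29, LUCENE_30;
        boolean onOrAfter(Version other) { return compareTo(other) >= 0; }
    }

    private boolean enablePositionIncrements;

    VersionGatedFilter(Version matchVersion) {
        // the ctor parameter is gone; the default is true from 2.9 onwards
        enablePositionIncrements = matchVersion.onOrAfter(Version.LUCENE_29);
    }

    void setEnablePositionIncrements(boolean b) { enablePositionIncrements = b; }

    boolean getEnablePositionIncrements() { return enablePositionIncrements; }
}
```

The glitch Shai reports is orthogonal to the pattern itself: dropping the ctor parameter is fine only if the @deprecated text points callers at the setter that replaces it.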
[jira] Commented: (SOLR-2410) ConcurrentLRUCache can throw class cast exception
[ https://issues.apache.org/jira/browse/SOLR-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003843#comment-13003843 ] Dawid Weiss commented on SOLR-2410: --- Yep, sorry for the belated follow-up, I went to sleep last night ;) The workaround is to access T[] as an Object[] -- the actual type of the object that is stored in that field. I like reverting to Object[] much more since this feature is quite nasty and puzzling, especially for people who haven't seen it before (I have seen it a few times, for example here: http://issues.carrot2.org/browse/HPPC-46). Glad I could help. ConcurrentLRUCache can throw class cast exception - Key: SOLR-2410 URL: https://issues.apache.org/jira/browse/SOLR-2410 Project: Solr Issue Type: Bug Affects Versions: 4.0 Reporter: Yonik Seeley Fix For: 4.0 Attachments: SOLR-2410.patch, SOLR-2410.patch ConcurrentLRUCache throws a class cast exception.
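The class cast exception Dawid describes comes from a well-known erasure pitfall: a field declared `T[]` but created as `new Object[n]` works inside the declaring class (where `T` erases to `Object`), yet blows up as soon as a subclass binds `T` to a concrete type and touches the field, because javac inserts a checkcast to the concrete array type at the use site. A minimal reproduction (class names are illustrative, not Solr's):

```java
// Field declared T[], but the object stored in it is a plain Object[].
class Holder<T> {
    @SuppressWarnings("unchecked")
    protected final T[] slots = (T[]) new Object[4]; // unchecked: no runtime check here
}

class LongHolder extends Holder<Long> {
    Object first() {
        // javac inserts a checkcast to Long[] on this access; it fails at
        // runtime because 'slots' really holds an Object[]
        return slots[0];
    }
}
```

Keeping the field private (or reverting its declared type to `Object[]`, as the patch does) removes the use-site checkcast and the failure with it.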
[jira] Commented: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003868#comment-13003868 ] Uwe Schindler commented on LUCENE-2953: --- The easiest and simplest way to handle this is using Object[] like in ArrayList. The problem with then always casting from Object to T is thousands of unchecked warnings in PriorityQueue. I would propose the following: In general the final T[] heap variable should be private to the PQ and used only there. For performance Yonik wanted the heap[] protected and that caused the issue. As long as the heap[] array is private it can never be accessed incorrectly. So my proposal is to internally use the T[] as a private field and simply use another Object[] that's protected (pointing to the same array). This would fix the problem. The most correct idea would be to add a setHeapSlot(int, T o) and T getHeapSlot(int) method and hide the T[] heap completely, but I know, Yonik will disagree :-) There is some other problem: the heap array should be final, but it cannot be, because of the stupid initialize() method. I would like to remove this method and simply move the code to PQ's ctor. I don't understand why the initialize() method is there, which is a problem: Every guide on Java programming tells you to never call protected overridable methods from ctors, as this can break easily. If the heap[] is final, the problem of having two references to the same object is not a problem anymore. 
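Uwe's two-reference proposal can be sketched like this. It is a hypothetical simplification under assumed names (`SafeHeapQueue`, `rawHeap`), not the actual PriorityQueue patch: the queue keeps a private `T[]` for its own typed accesses and exposes the very same array object through a `protected Object[]` reference, so subclasses never hit the use-site checkcast that made non-Object type bindings fail.

```java
// One array object, two typed views of it.
class SafeHeapQueue<T> {
    private final T[] heap;           // typed view, used only inside this class
    protected final Object[] rawHeap; // same array object, safe for subclasses

    @SuppressWarnings("unchecked")
    SafeHeapQueue(int maxSize) {
        Object[] h = new Object[maxSize + 1];
        heap = (T[]) h;               // unchecked, but never escapes as T[]
        rawHeap = h;
    }

    void set(int slot, T value) { heap[slot] = value; }

    T get(int slot) { return heap[slot]; }
}

class LongHeapQueue extends SafeHeapQueue<Long> {
    LongHeapQueue(int maxSize) { super(maxSize); }

    Object firstRaw() {
        // rawHeap is declared Object[], so no checkcast to Long[] is inserted
        return rawHeap[1];
    }
}
```

Dawid's getter variant in the next comment is equivalent in effect; it just trades the second field for a cast inside one accessor.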
[jira] Commented: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003873#comment-13003873 ] Dawid Weiss commented on LUCENE-2953: - The problem with then always casting from Object to T is thousands of unchecked warnings in PriorityQueue. You could erase the type in internal methods of PriorityQueue and use Object instead of T then. So my proposal is to internally use the T[] as a private field and simply use another Object[] that's protected (pointing to the same array). Or a protected getter method that would do the cast (why bother with having two fields): {noformat} protected Object[] getStorageArray() { return (Object[]) heap; } {noformat} If Yonik wants access to that array I'm sure he copies it to a local var. prior to doing any intensive loops...
[jira] Commented: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003876#comment-13003876 ] Uwe Schindler commented on LUCENE-2953: --- bq. Or a protected getter method that would do the cast (why bother with having two fields) Good idea, I am currently fixing all of this (I was the one who added the generics in Lucene 3.0). But I am now also removing initialize(int), this construct is very broken. In trunk we can break backwards for this.
[jira] Commented: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003878#comment-13003878 ] Dawid Weiss commented on LUCENE-2953: - I remember using pq at some point for other things and hating that initialize method, so I'm all for it.
Re: [HUDSON] Lucene-Solr-tests-only-trunk - Build # 5709 - Failure
This seed doesn't repro the failure for me: ant test -Dtestcase=TestIndexWriter -Dtestmethod=testIndexingThenDeleting -Dtests.seed=2772841086465723649:-8475474922759781208 -Dtests.multiplier=3 Can anyone repro in their env? Mike On Mon, Mar 7, 2011 at 8:33 PM, Apache Hudson Server hud...@hudson.apache.org wrote: Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/5709/ 1 tests failed. REGRESSION: org.apache.lucene.index.TestIndexWriter.testIndexingThenDeleting Error Message: flush happened too quickly during deleting count=1155 Stack Trace: junit.framework.AssertionFailedError: flush happened too quickly during deleting count=1155 at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1213) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1145) at org.apache.lucene.index.TestIndexWriter.testIndexingThenDeleting(TestIndexWriter.java:2579) Build Log (for compile errors): [...truncated 3082 lines...] -- Mike http://blog.mikemccandless.com
[jira] Commented: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003879#comment-13003879 ] Michael McCandless commented on LUCENE-2573: Woops -- I screwed up the above test! Corrected numbers: trunk takes 439 sec to index 10M docs, 202 sec to waitForMerges, 17 sec to commit. RT + patch: 289 sec to index 10M docs, 225 sec to waitForMerges, 26 sec to commit. This is w/ 8 threads (machine has 24 cores), writing to Intel X25M G2 ssd. Awesome speedup!! Nice work everyone ;) Looking forward to making a new blog post soon! Tiered flushing of DWPTs by RAM with low/high water marks - Key: LUCENE-2573 URL: https://issues.apache.org/jira/browse/LUCENE-2573 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Simon Willnauer Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch Now that we have DocumentsWriterPerThreads we need to track total consumed RAM across all DWPTs. A flushing strategy idea that was discussed in LUCENE-2324 was to use a tiered approach: - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM) - Flush all DWPTs at a high water mark (e.g. at 110%) - Use linear steps in between high and low watermark: E.g. when 5 DWPTs are used, flush at 90%, 95%, 100%, 105% and 110%. Should we allow the user to configure the low and high water mark values explicitly using total values (e.g. low water mark at 120MB, high water mark at 140MB)? Or shall we keep for simplicity the single setRAMBufferSizeMB() config method and use something like 90% and 110% for the water marks?
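The linear-step idea from the issue description can be sketched as follows. This is an illustration of the arithmetic only; `TieredThresholds` is a hypothetical helper, not the actual flushing-policy code: with n DWPTs, flush triggers are spread evenly between the low and high water marks of the configured RAM budget.

```java
// Spread n flush triggers linearly between the low and high water marks.
class TieredThresholds {
    static double[] thresholds(double ramBudgetMB, int numDWPTs,
                               double low, double high) {
        double[] t = new double[numDWPTs];
        if (numDWPTs == 1) {
            t[0] = ramBudgetMB * low;  // single DWPT: flush at the low mark
            return t;
        }
        double step = (high - low) / (numDWPTs - 1);
        for (int i = 0; i < numDWPTs; i++) {
            t[i] = ramBudgetMB * (low + i * step);
        }
        return t;
    }
}
```

For the example in the issue (5 DWPTs, 90%/110% marks, say a 100 MB budget) this yields flush triggers at 90, 95, 100, 105 and 110 MB.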
[jira] Updated: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2953: -- Attachment: LUCENE-2953.patch Here is my patch; it also removes initialize(int), which is bad design. For 3.x we can simply leave out this change and only make the heap variable private and expose it as Object[] using a getter method.
[jira] Issue Comment Edited: (LUCENE-2953) PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object
[ https://issues.apache.org/jira/browse/LUCENE-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003882#comment-13003882 ] Uwe Schindler edited comment on LUCENE-2953 at 3/8/11 9:58 AM: --- bq. I remember using pq at some point for other things and hating that initialize method, so I'm all for it. The initialize method was only there for one single reason in Lucene (a hack): the getSentinelObject() method was used in HitQueue like this: it should return null for some special corner case. To enable this special case, a boolean field was used. But the ctor had to populate that field before the pre-populating in super() was done, and that's impossible. I changed that by adding a boolean ctor to the PQ base class to enable/disable pre-populating like HitQueue did before. was (Author: thetaphi): bq. I remember using pq at some point for other things and hating that initialize method, so I'm all for it. The initialize method was only there for one single reason in Lucene (a hack). The getSentinelObject() method was used in HitQueue in a very special way: it should return null for some special case. To enable this special case, a boolean field was used. But the ctor had to populate that field before the pre-populating was done, and that's impossible. I changed that by adding a boolean ctor to the PQ base class to enable/disable pre-populating like HitQueue did before. PriorityQueue is inherently broken if subclass attempts to use heap w/generic T bound to anything other than Object Key: LUCENE-2953 URL: https://issues.apache.org/jira/browse/LUCENE-2953 Project: Lucene - Java Issue Type: Bug Reporter: Hoss Man Attachments: BenchmarkArrayAccess.java, LUCENE-2953.patch, LUCENE-2953.patch as discovered in SOLR-2410, the fact that the protected heap variable in PriorityQueue is initialized using an Object[] makes it impossible for subclasses of PriorityQueue to exist and access the heap array unless they bind the generic to Object.
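The ordering constraint Uwe describes (a subclass field cannot influence pre-population performed in super()) is plain Java initialization semantics; here is a minimal sketch with hypothetical names, not the actual HitQueue code:

```java
// Subclass field initializers run only AFTER the super() call returns,
// so a base-class ctor that pre-populates cannot consult a subclass
// field. Passing a boolean flag up through the ctor -- the fix Uwe
// describes -- is the only way to make the decision in time.
class Base {
    final Object[] slots = new Object[4];

    Base(boolean prePopulate) { // the flag travels through the ctor
        if (prePopulate) {
            for (int i = 0; i < slots.length; i++) slots[i] = "sentinel";
        }
    }
}

class Sub extends Base {
    boolean wantSentinels = true; // assigned too late for super() to see

    Sub() {
        super(true); // decision must be made here, not via the field
    }
}

public class CtorOrder {
    public static void main(String[] args) {
        System.out.println(new Sub().slots[0]);
    }
}
```

Had Base read a protected boolean field instead of a ctor parameter, it would always observe the field's default value (false) during construction, which is exactly why the old getSentinelObject()/initialize(int) dance was needed.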
Re: [VOTE] Lucene and Solr 3.1 release candidate
Hello, the lucene-solr-grandparent pom [1] file mentions a jetty version 6.1.26-patched-JETTY-1340 which is not available in the repositories where I would expect it. Do I need to enable some additional repository? This seems related to SOLR-2381. I think for people using Solr as their dependency via Maven, this is a blocker; of course not everyone uses it so I've no strong opinions about this, but thought to let you know. Personally I'd depend on the released version of jetty and document that this bug is not fixed until Jetty version XY is released; alternatively, I'd keep the pom as is, but instructions and warnings in the release notes would be very welcome. (I couldn't find a CHANGES.html for Solr?) Regards, Sanne [1] http://people.apache.org/~rmuir/staging_area/lucene-solr-3.1RC0-rev1078688/lucene-3.1RC0/maven/org/apache/lucene/lucene-solr-grandparent/3.1.0/lucene-solr-grandparent-3.1.0.pom 2011/3/8 Shai Erera ser...@gmail.com: I found what seems to be a glitch in StopFilter's ctors -- the boolean 'enablePosInc' was removed from the ctors and users now have to use the setter instead. However, the ctors do default to 'true' if the passed in Version is onOrAfter(29). All of FilteringTokenFilter sub-classes include the enablePosIncr in their ctors, including FilteringTF itself. Therefore I assume the parameter was mistakenly dropped from StopFilter's ctors. Also, the @deprecated text doesn't mention how should I enable/disable it, and reading the source code doesn't help either, since the setter/getter are in FilteringTF. Also, LengthFilter has a deprecated ctor, but the class was added on Nov 16 and I don't see it in 3.0.3. So perhaps we can remove that ctor (and add a @since tag to the class)? I don't know if these two warrant a new RC but I think they are important to fix. Shai On Mon, Mar 7, 2011 at 5:52 PM, Smiley, David W. dsmi...@mitre.org wrote: So https://issues.apache.org/jira/browse/SOLR-2405 didn't make it in yesterday (apparently it didn't)?
:-( Darn... maybe I shouldn't have waited for a committer to agree with the issue. I would have had it in Saturday. ~ David Smiley On Mar 7, 2011, at 1:32 AM, Robert Muir wrote: Hi all, I have posted a release candidate for both Lucene 3.1 and Solr 3.1, both from revision 1078688 of http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/ Thanks for all your help! Please test them and give your votes, the tentative release date for both versions is Sunday, March 13th, 2011. Only votes from Lucene PMC are binding, but everyone is welcome to check the release candidates and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast. The release candidates are produced in parallel because in 2010 we merged the development of Lucene and Solr in order to produce higher quality releases. While we voted to reserve the right to release Lucene by itself, in my opinion we should definitely try to avoid this unless absolutely necessary, as it would ultimately cause more work and complication: instead it would be far easier to just fix whatever issues are discovered and respin both releases again. Because of this, I ask that you cast a single vote to cover both releases. If the vote succeeds, both sets of artifacts can go their separate ways to the different websites. Artifacts are located here: http://s.apache.org/solrcene31rc0 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2573) Tiered flushing of DWPTs by RAM with low/high water marks
[ https://issues.apache.org/jira/browse/LUCENE-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003909#comment-13003909 ] Michael McCandless commented on LUCENE-2573: Not done reviewing the patch but here's some initial feedback: Very cool (and super advanced) that this adds a FlushPolicy! But for normal usage we go and make either DocCountFP or TieredFP, depending on whether IWC is flushing by docCount, RAM or both right? Ie one normally need not make their own FlushPolicy. Maybe rename TieredFP - ByRAMFP? Also, I'm not sure we need the N tiers? I suspect that may flush too heavily? Can we instead simplify it and have only the low and high water marks? So we flush when active RAM is over low water mark? (And we stall if active + flushing RAM exceeds high water mark). Can we rename isHealthy to isStalled (ie, invert it)? I'm still unsure we should even include any healthy check APIs. This is an exceptional situation and I don't think we need API exposure for it? If apps really want to, they can turn on infoStream (we should make sure stalling is logged, just like it is for merging) and debug from there? Maybe rename pendingBytes to flushingBytes? Or maybe flushPendingBytes? (Just to make it clear what we are pending on...). Maybe rename FP.printInfo(String msg) -- FP.message? (Consistent w/ our other classes). I wonder if FP.findFlushes should be renamed to something like FP.visit, and return void? Ie, it's called for its side effects of marking DWPTs for flushing, right? Separately, whether or not this thread will go and flush a DWPT is for IW to decide? (Like it could be this thread didn't mark any new flush required, but it should go off and pull a DWPT previously marked by another thread). So then IW would have a private volatile boolean recording whether any active DWPTs have flushPending. 
Tiered flushing of DWPTs by RAM with low/high water marks - Key: LUCENE-2573 URL: https://issues.apache.org/jira/browse/LUCENE-2573 Project: Lucene - Java Issue Type: Improvement Components: Index Reporter: Michael Busch Assignee: Simon Willnauer Priority: Minor Fix For: Realtime Branch Attachments: LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch, LUCENE-2573.patch Now that we have DocumentsWriterPerThreads we need to track total consumed RAM across all DWPTs. A flushing strategy idea that was discussed in LUCENE-2324 was to use a tiered approach: - Flush the first DWPT at a low water mark (e.g. at 90% of allowed RAM) - Flush all DWPTs at a high water mark (e.g. at 110%) - Use linear steps in between high and low watermark: E.g. when 5 DWPTs are used, flush at 90%, 95%, 100%, 105% and 110%. Should we allow the user to configure the low and high water mark values explicitly using total values (e.g. low water mark at 120MB, high water mark at 140MB)? Or shall we keep for simplicity the single setRAMBufferSizeMB() config method and use something like 90% and 110% for the water marks? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Lucene and Solr 3.1 release candidate
I found another problem. Maybe something changed in the logic of FSDirectory.open(), but now when I run some tests over my code, I see that if MMapDirectory is chosen and an attempt to seek to an incorrect position is made, IllegalArgumentException is thrown, instead of IOE. This breaks my code which catches IOE and handle it. While I can modify my code to also catch IAE, I wonder if it's ok that MMapDir throws such an exception instead of IOE. It's actually thrown from ByteBuffer, however the caller, which holds a reference to a Directory, does not know if the underlying impl is MMapDir, LinuxFSDir etc. Should we fix it on the new branch and go for a second RC? Shai On Tue, Mar 8, 2011 at 9:19 AM, Shai Erera ser...@gmail.com wrote: I found what seems to be a glitch in StopFilter's ctors -- the boolean 'enablePosInc' was removed from the ctors and users now have to use the setter instead. However, the ctors do default to 'true' if the passed in Version is onOrAfter(29). All of FilteringTokenFilter sub-classes include the enablePosIncr in their ctors, including FilteringTF itself. Therefore I assume the parameter was mistakenly dropped from StopFilter's ctors. Also, the @deprecated text doesn't mention how should I enable/disable it, and reading the source code doesn't help either, since the setter/getter are in FilteringTF. Also, LengthFilter has a deprecated ctor, but the class was added on Nov 16 and I don't see it in 3.0.3. So perhaps we can remove that ctor (and add a @since tag to the class)? I don't know if these two warrant a new RC but I think they are important to fix. Shai On Mon, Mar 7, 2011 at 5:52 PM, Smiley, David W. dsmi...@mitre.orgwrote: So https://issues.apache.org/jira/browse/SOLR-2405 didn't make it in yesterday (apparently it didn't)? :-( Darn... maybe I shouldn't have waited for a committer to agree with the issue. I would have had it in Saturday. 
~ David Smiley On Mar 7, 2011, at 1:32 AM, Robert Muir wrote: Hi all, I have posted a release candidate for both Lucene 3.1 and Solr 3.1, both from revision 1078688 of http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/ Thanks for all your help! Please test them and give your votes, the tentative release date for both versions is Sunday, March 13th, 2011. Only votes from Lucene PMC are binding, but everyone is welcome to check the release candidates and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast. The release candidates are produced in parallel because in 2010 we merged the development of Lucene and Solr in order to produce higher quality releases. While we voted to reserve the right to release Lucene by itself, in my opinion we should definitely try to avoid this unless absolutely necessary, as it would ultimately cause more work and complication: instead it would be far easier to just fix whatever issues are discovered and respin both releases again. Because of this, I ask that you cast a single vote to cover both releases. If the vote succeeds, both sets of artifacts can go their separate ways to the different websites. Artifacts are located here: http://s.apache.org/solrcene31rc0 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [VOTE] Lucene and Solr 3.1 release candidate
On Tue, Mar 8, 2011 at 8:11 AM, Shai Erera ser...@gmail.com wrote: I found another problem. Maybe something changed in the logic of FSDirectory.open(), but now when I run some tests over my code, I see that if MMapDirectory is chosen and an attempt to seek to an incorrect position is made, IllegalArgumentException is thrown, instead of IOE. This breaks my code which catches IOE and handle it. While I can modify my code to also catch IAE, I wonder if it's ok that MMapDir throws such an exception instead of IOE. It's actually thrown from ByteBuffer, however the caller, which holds a reference to a Directory, does not know if the underlying impl is MMapDir, LinuxFSDir etc. This isn't a bug: it's clearly documented in the release notes that FSDirectory.open()'s behavior has changed to return different implementations for some platforms. Additionally, the documentation in FSDirectory states that different implementations have quirks and recommends instantiating the desired implementation directly. The behavior of what exceptions MMapDirectory throws has not changed: it throws the same exceptions it always did. If your code depends upon the exact exception classes or text, I think you should instantiate the directory directly (and for the record, I think this stuff is still open season to change regardless, as it's internal). Nowhere do any javadocs claim that any specific runtime exception is thrown.
RE: [VOTE] Lucene and Solr 3.1 release candidate
Hi, I found a serious issue in CheckIndex.java (lines 357++): If you run CheckIndex on an index updated or changed with 3.1, it prints the following: 2011-03-08 14:38:56,373 INFO org.apache.lucene.index.CheckIndex - Segments file=segments_g19 numSegments=5 version=-11 [Lucene 1.3 or prior] Too stupid. We should check all other version numbers printed in CheckIndex and fix accordingly. I know Shai added new versions in several other files, too. I don't think we can provide this to users, as it will cause lots of JIRA issues complaining about that. Do we also need to fix the Solr ConcurrentLRUMap issue? Yonik? I provided a patch this morning. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, March 07, 2011 7:33 AM To: dev@lucene.apache.org Subject: [VOTE] Lucene and Solr 3.1 release candidate Hi all, I have posted a release candidate for both Lucene 3.1 and Solr 3.1, both from revision 1078688 of http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/ Thanks for all your help! Please test them and give your votes, the tentative release date for both versions is Sunday, March 13th, 2011. Only votes from Lucene PMC are binding, but everyone is welcome to check the release candidates and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast. The release candidates are produced in parallel because in 2010 we merged the development of Lucene and Solr in order to produce higher quality releases. While we voted to reserve the right to release Lucene by itself, in my opinion we should definitely try to avoid this unless absolutely necessary, as it would ultimately cause more work and complication: instead it would be far easier to just fix whatever issues are discovered and respin both releases again. Because of this, I ask that you cast a single vote to cover both releases.
If the vote succeeds, both sets of artifacts can go their separate ways to the different websites. Artifacts are located here: http://s.apache.org/solrcene31rc0 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Updated: (SOLR-2411) Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it
[ https://issues.apache.org/jira/browse/SOLR-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe updated SOLR-2411: -- Affects Version/s: 4.0 3.2 3.1 Fix Version/s: 4.0 3.2 3.1 Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it - Key: SOLR-2411 URL: https://issues.apache.org/jira/browse/SOLR-2411 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Priority: Minor Fix For: 3.1, 3.2, 4.0 Attachments: SOLR-2411.patch Build targets dist, dist-*, create-package, package, package-src, etc. use {{dist/}} as a landing spot for intermediate .jar files which will not be individually shipped. These targets should instead use {{solr/build/}} to hold these intermediate .jars. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2411) Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it
[ https://issues.apache.org/jira/browse/SOLR-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003944#comment-13003944 ] Robert Muir commented on SOLR-2411: --- +1 It's confusing that 'non-dist' stuff goes into the dist/ folder. This is how the .war file accidentally ended up in the release candidate. If we apply this patch, then the build works consistently with Lucene's, and it's easier for the release manager to correctly upload things, because it's all in dist/ Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it - Key: SOLR-2411 URL: https://issues.apache.org/jira/browse/SOLR-2411 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Priority: Minor Fix For: 3.1, 3.2, 4.0 Attachments: SOLR-2411.patch Build targets dist, dist-*, create-package, package, package-src, etc. use {{dist/}} as a landing spot for intermediate .jar files which will not be individually shipped. These targets should instead use {{solr/build/}} to hold these intermediate .jars.
RE: [VOTE] Lucene and Solr 3.1 release candidate
I opened https://issues.apache.org/jira/browse/LUCENE-2954 - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Tuesday, March 08, 2011 2:43 PM To: dev@lucene.apache.org Subject: RE: [VOTE] Lucene and Solr 3.1 release candidate Hi, I found a serious issue in CheckIndex.java (lines 357++): If you run CheckIndex on an index updated or changed with 3.1 it print the following: 2011-03-08 14:38:56,373 INFO org.apache.lucene.index.CheckIndex - Segments file=segments_g19 numSegments=5 version=-11 [Lucene 1.3 or prior] Too stupid. We should check all other version numbers printed in CheckIndex and fix accordingly. I know, Shaie added new versions in several other files, too. I don't think we can provide this to users, as it will cause lot's of JIRA issues complaining about that. Do we also need to fix the Solr' ConcurentLRUMap issue? Yonik? I provided a patch this morning. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, March 07, 2011 7:33 AM To: dev@lucene.apache.org Subject: [VOTE] Lucene and Solr 3.1 release candidate Hi all, I have posted a release candidate for both Lucene 3.1 and Solr 3.1, both from revision 1078688 of http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/ Thanks for all your help! Please test them and give your votes, the tentative release date for both versions is Sunday, March 13th, 2011. Only votes from Lucene PMC are binding, but everyone is welcome to check the release candidates and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast. The release candidates are produced in parallel because in 2010 we merged the development of Lucene and Solr in order to produce higher quality releases. 
While we voted to reserve the right to release Lucene by itself, in my opinion we should definitely try to avoid this unless absolutely necessary, as it would ultimately cause more work and complication: instead it would be far easier to just fix whatever issues are discovered and respin both releases again. Because of this, I ask that you cast a single vote to cover both releases. If the vote succeeds, both sets of artifacts can go their separate ways to the different websites. Artifacts are located here: http://s.apache.org/solrcene31rc0 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [HUDSON] Lucene-Solr-tests-only-trunk - Build # 5709 - Failure
Mike, I can repro, but not consistently. In a while[1] script, after 12 iterations (I compiled with Java 5, then ran tests with Java 6, as on Jenkins): junit-sequential: [junit] RESOURCE LEAK: test method: 'testIndexingThenDeleting' left 1 thread(s) running [junit] NOTE: test params are: codec=RandomCodecProvider: {field=MockFixedIntBlock(blockSize=26)}, locale=th, timezone=US/East-Indiana [junit] NOTE: all tests run in this JVM: [junit] [TestIndexWriter] [junit] NOTE: Windows 7 6.1 amd64/Sun Microsystems Inc. 1.6.0_23 (64-bit)/cpus=8,threads=1,free=98650696,total=124321792 [junit] - --- [junit] Testcase: testIndexingThenDeleting(org.apache.lucene.index.TestIndexWriter):FAILED [junit] flush happened too quickly during deleting count=1155 [junit] junit.framework.AssertionFailedError: flush happened too quickly during deleting count=1155 [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1213) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1145) [junit] at org.apache.lucene.index.TestIndexWriter.testIndexingThenDeleting(TestIndexWriter.java:2579) -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Tuesday, March 08, 2011 4:46 AM To: dev@lucene.apache.org Cc: Apache Hudson Server Subject: Re: [HUDSON] Lucene-Solr-tests-only-trunk - Build # 5709 - Failure This seed doesn't repro the failure for me: ant test -Dtestcase=TestIndexWriter -Dtestmethod=testIndexingThenDeleting -Dtests.seed=2772841086465723649:-8475474922759781208 -Dtests.multiplier=3 Can anyone repro in their env? Mike On Mon, Mar 7, 2011 at 8:33 PM, Apache Hudson Server hud...@hudson.apache.org wrote: Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only- trunk/5709/ 1 tests failed. 
REGRESSION: org.apache.lucene.index.TestIndexWriter.testIndexingThenDeleting Error Message: flush happened too quickly during deleting count=1155 Stack Trace: junit.framework.AssertionFailedError: flush happened too quickly during deleting count=1155 at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1213) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1145) at org.apache.lucene.index.TestIndexWriter.testIndexingThenDeleting(TestIndexWriter.java:2579) Build Log (for compile errors): [...truncated 3082 lines...] -- Mike http://blog.mikemccandless.com
[jira] Created: (LUCENE-2954) CheckIndex prints wrong version number on 3.1 indexes (and possibly also in trunk)
CheckIndex prints wrong version number on 3.1 indexes (and possibly also in trunk) - Key: LUCENE-2954 URL: https://issues.apache.org/jira/browse/LUCENE-2954 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 3.1 Reporter: Uwe Schindler Assignee: Uwe Schindler Priority: Blocker Fix For: 3.1, 3.2, 4.0 When you run CheckIndex on an index created/updated with 3.1, it prints this about the SegmentInfos: {noformat} Segments file=segments_g19 numSegments=5 version=-11 [Lucene 1.3 or prior] {noformat} We should fix CheckIndex and also verify other cases where version numbers are printed out. In trunk the issue may be more complicated!
Re: BM 25 scoring with lucene
The LUCENE-2091.patch file from the jira entry is essentially what we are using. It should work fine. -- Avi 2011/3/2 Gérard Dupont ger.dup...@gmail.com Hi, On 2 March 2011 07:50, Lahiru Samarakoon lahir...@gmail.com wrote: Hi All, Do you have any BM 25 scoring implementation which can be used with Lucene? One query on Google, first result (for me): http://nlp.uned.es/~jperezi/Lucene-BM25/ (But it's from 2009) How can I find and use the implementation mentioned in the following jira entry? https://issues.apache.org/jira/browse/LUCENE-2091 Thanks, Lahiru -- Gérard Dupont Information Processing Control and Cognition (IPCC) CASSIDIAN - an EADS company Document Learning team - LITIS Laboratory
[jira] Updated: (LUCENE-2954) CheckIndex prints wrong version number on 3.1 indexes (and possibly also in trunk)
[ https://issues.apache.org/jira/browse/LUCENE-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2954: -- Attachment: LUCENE-2954.patch New patch: - handle the preliminary 3.1 version exactly as 2.9 did - throw an exception if somebody fails to upgrade this tool (will be hit in trunk when this patch is merged) CheckIndex prints wrong version number on 3.1 indexes (and possibly also in trunk) - Key: LUCENE-2954 URL: https://issues.apache.org/jira/browse/LUCENE-2954 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 3.1 Reporter: Uwe Schindler Assignee: Uwe Schindler Priority: Blocker Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2954.patch, LUCENE-2954.patch When you run CheckIndex on an index created/updated with 3.1, it prints this about the SegmentInfos: {noformat} Segments file=segments_g19 numSegments=5 version=-11 [Lucene 1.3 or prior] {noformat} We should fix CheckIndex and also verify other cases where version numbers are printed out. In trunk the issue may be more complicated!
Re: [VOTE] Lucene and Solr 3.1 release candidate
What about StopFilter (and LengthFilter) -- should we fix them before 3.1? Shai On Tue, Mar 8, 2011 at 4:05 PM, Uwe Schindler u...@thetaphi.de wrote: I opened https://issues.apache.org/jira/browse/LUCENE-2954 - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Tuesday, March 08, 2011 2:43 PM To: dev@lucene.apache.org Subject: RE: [VOTE] Lucene and Solr 3.1 release candidate Hi, I found a serious issue in CheckIndex.java (lines 357++): If you run CheckIndex on an index updated or changed with 3.1 it print the following: 2011-03-08 14:38:56,373 INFO org.apache.lucene.index.CheckIndex - Segments file=segments_g19 numSegments=5 version=-11 [Lucene 1.3 or prior] Too stupid. We should check all other version numbers printed in CheckIndex and fix accordingly. I know, Shaie added new versions in several other files, too. I don't think we can provide this to users, as it will cause lot's of JIRA issues complaining about that. Do we also need to fix the Solr' ConcurentLRUMap issue? Yonik? I provided a patch this morning. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Monday, March 07, 2011 7:33 AM To: dev@lucene.apache.org Subject: [VOTE] Lucene and Solr 3.1 release candidate Hi all, I have posted a release candidate for both Lucene 3.1 and Solr 3.1, both from revision 1078688 of http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/ Thanks for all your help! Please test them and give your votes, the tentative release date for both versions is Sunday, March 13th, 2011. Only votes from Lucene PMC are binding, but everyone is welcome to check the release candidates and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast. 
The release candidates are produced in parallel because in 2010 we merged the development of Lucene and Solr in order to produce higher quality releases. While we voted to reserve the right to release Lucene by itself, in my opinion we should definitely try to avoid this unless absolutely necessary, as it would ultimately cause more work and complication: instead it would be far easier to just fix whatever issues are discovered and respin both releases again. Because of this, I ask that you cast a single vote to cover both releases. If the vote succeeds, both sets of artifacts can go their separate ways to the different websites. Artifacts are located here: http://s.apache.org/solrcene31rc0 - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Resolved: (SOLR-2411) Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it
[ https://issues.apache.org/jira/browse/SOLR-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Rowe resolved SOLR-2411. --- Resolution: Fixed Assignee: Steven Rowe Build target prepare-release should produce a solr/dist/ directory that only has distribution files in it - Key: SOLR-2411 URL: https://issues.apache.org/jira/browse/SOLR-2411 Project: Solr Issue Type: Improvement Components: Build Affects Versions: 3.1, 3.2, 4.0 Reporter: Steven Rowe Assignee: Steven Rowe Priority: Minor Fix For: 3.1, 3.2, 4.0 Attachments: SOLR-2411.patch Build targets dist, dist-*, create-package, package, package-src, etc. use {{dist/}} as a landing spot for intermediate .jar files which will not be individually shipped. These targets should instead use {{solr/build/}} to hold these intermediate .jars. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (SOLR-2381) The included jetty server does not support UTF-8
[ https://issues.apache.org/jira/browse/SOLR-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003980#comment-13003980 ] Bernd Fehling commented on SOLR-2381: - Robert, unfortunately I wasn't able to build a reproducible test, so I decided to debug it on my server. The bug is in Jetty and has been fixed in jetty-7.3.1.v20110307. Because I started debugging over the weekend, I used the older jetty-7.3.0 with the bug included, located the bug, and recognized today that it had just been fixed in the new version from yesterday. Nevertheless, here is the description, because I went through all the bits and bytes. In jetty-7 there is jetty-server with org.eclipse.jetty.server.HttpWriter.java. That is the output writer which extends Writer and does the UTF-8 encoding. The buffer is 8192 bytes and is chunked and encoded by HttpWriter in pieces of 512 bytes. In Java the text is UTF-16, and each char is read as an integer. If the code point is above the BMP, it is stored as a surrogate pair: the high surrogate is read first, then the next integer. Example: 55349 (dec) and 56320 (dec) are converted to 119808 (dec), which is U+1D400. Remember that the chunk is 512 bytes. But what if the counter is at 510 and a code point above the BMP comes up? The solution is to write the current buffer to output, reset it, and start over with an empty buffer. And here is/was the bug: the pending surrogate was cleared too early, at the wrong place, and got lost. If I find an svn with the jetty-6.1.26 sources I will look into that one also. Otherwise, use jetty-7.3.1.v20110307, which is fixed. Maybe we should set up an XML test page that has more than 512 characters of above-BMP UTF-8 code in a row?
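The decimal example above can be checked directly in Java. This is a minimal sketch (not the actual Jetty HttpWriter code) of the surrogate handling described: a code point above the BMP is stored as two UTF-16 chars, and if an encoder flushes its 512-byte chunk between the two chars while forgetting the pending high surrogate, the code point is lost.

```java
public class SurrogateDemo {
    public static void main(String[] args) {
        char high = 55349; // 0xD835, high (leading) surrogate
        char low  = 56320; // 0xDC00, low (trailing) surrogate

        // Combining the pair yields a single code point above the BMP.
        int codePoint = Character.toCodePoint(high, low);
        System.out.println(codePoint);                        // 119808
        System.out.println(String.format("U+%X", codePoint)); // U+1D400

        // A test string of 600 such pairs, as suggested above, is long
        // enough that an encoder chunking at 512 bytes must split it.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 600; i++) {
            sb.append(high).append(low);
        }
        // 600 code points = 1200 UTF-16 chars = 2400 UTF-8 bytes (4 each)
        System.out.println(sb.codePointCount(0, sb.length())); // 600
    }
}
```

A correct encoder must carry the high surrogate over to the next chunk whenever a flush lands between the two halves of a pair.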
The included jetty server does not support UTF-8 Key: SOLR-2381 URL: https://issues.apache.org/jira/browse/SOLR-2381 Project: Solr Issue Type: Bug Reporter: Robert Muir Assignee: Robert Muir Priority: Blocker Fix For: 3.1, 4.0 Attachments: SOLR-2381.patch, SOLR-ServletOutputWriter.patch, jetty-6.1.26-patched-JETTY-1340.jar, jetty-util-6.1.26-patched-JETTY-1340.jar Some background here: http://www.lucidimagination.com/search/document/6babe83bd4a98b64/which_unicode_version_is_supported_with_lucene Some possible solutions: * wait and see if we get resolution on http://jira.codehaus.org/browse/JETTY-1340. To be honest, I am not even sure where jetty is being maintained (there is a separate jetty project at eclipse.org with another bugtracker, but the older releases are at codehaus). * include a patched version of jetty with correct utf-8, using that patch. * remove jetty and include a different container instead.
[jira] Commented: (LUCENE-2954) CheckIndex prints wrong version number on 3.1 indexes (and possibly also in trunk)
[ https://issues.apache.org/jira/browse/LUCENE-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003981#comment-13003981 ] Robert Muir commented on LUCENE-2954: - Thanks for adding the check to prevent this from ever biting us again... if someone bumps this version and doesn't properly update CheckIndex, tests will fail. I like this! CheckIndex prints wrong version number on 3.1 indexes (and possibly also in trunk) - Key: LUCENE-2954 URL: https://issues.apache.org/jira/browse/LUCENE-2954 Project: Lucene - Java Issue Type: Bug Components: Index Affects Versions: 3.1 Reporter: Uwe Schindler Assignee: Uwe Schindler Priority: Blocker Fix For: 3.1, 3.2, 4.0 Attachments: LUCENE-2954.patch, LUCENE-2954.patch When you run CheckIndex on an index created/updated with 3.1, it prints about the SegmentInfos: {noformat} Segments file=segments_g19 numSegments=5 version=-11 [Lucene 1.3 or prior] {noformat} We should fix CheckIndex and also verify other cases where version numbers are printed out. In trunk the issue may be more complicated!
[jira] Commented: (SOLR-2381) The included jetty server does not support UTF-8
[ https://issues.apache.org/jira/browse/SOLR-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003983#comment-13003983 ] Robert Muir commented on SOLR-2381: --- {quote} If I find a svn with jetty-6.1.26 sources I will look into that one also. {quote} But can you test the patched version of jetty we have here? This is more useful because it's the version we include (it's the only one we worry about).
[jira] Resolved: (LUCENE-2954) CheckIndex prints wrong version number on 3.1 indexes (and possibly also in trunk)
[ https://issues.apache.org/jira/browse/LUCENE-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-2954. --- Resolution: Fixed Lucene Fields: [New, Patch Available] (was: [New]) Committed 3.x revision: 1079381 Committed 3.1 revision: 1079382 Committed trunk revision: 1079386
RE: [VOTE] Lucene and Solr 3.1 release candidate
Hi Sanne, Solr (and some Lucene modules) have several non-Mavenized dependencies. To work around this, the Maven build has a profile called bootstrap. If you check out the source (or use the source distribution) you can place all non-Mavenized dependencies in your local repository as follows (from the top-level directory containing lucene, solr, etc.): ant get-maven-poms mvn -N -P bootstrap install Maybe there should also be a way to deploy these to an internal repository? Steve -Original Message- From: Sanne Grinovero [mailto:sanne.grinov...@gmail.com] Sent: Tuesday, March 08, 2011 6:44 AM To: dev@lucene.apache.org Subject: Re: [VOTE] Lucene and Solr 3.1 release candidate Hello, the lucene-solr-grandparent pom [1] file mentions a jetty version 6.1.26-patched-JETTY-1340 which is not available in the repositories where I would expect it. Do I need to enable some additional repository? This seems related to SOLR-2381. I think for people using Solr as their dependency via Maven this is a blocker; of course not everyone uses it, so I have no strong opinions about this, but I thought to let you know. Personally I'd depend on the released version of jetty and document that this bug is not fixed until Jetty version XY is released; alternatively, I'd keep the pom as is, but instructions and warnings in the release notes would be very welcome. (I couldn't find a Changes.html for Solr?) Regards, Sanne [1] http://people.apache.org/~rmuir/staging_area/lucene-solr-3.1RC0- rev1078688/lucene-3.1RC0/maven/org/apache/lucene/lucene-solr- grandparent/3.1.0/lucene-solr-grandparent-3.1.0.pom 2011/3/8 Shai Erera ser...@gmail.com: I found what seems to be a glitch in StopFilter's ctors -- the boolean 'enablePosInc' was removed from the ctors and users now have to use the setter instead. However, the ctors do default to 'true' if the passed-in Version is onOrAfter(29). All of FilteringTokenFilter's sub-classes include enablePosIncr in their ctors, including FilteringTF itself.
Therefore I assume the parameter was mistakenly dropped from StopFilter's ctors. Also, the @deprecated text doesn't mention how I should enable/disable it, and reading the source code doesn't help either, since the setter/getter are in FilteringTF. Also, LengthFilter has a deprecated ctor, but the class was added on Nov 16 and I don't see it in 3.0.3. So perhaps we can remove that ctor (and add a @since tag to the class)? I don't know if these two warrant a new RC, but I think they are important to fix. Shai On Mon, Mar 7, 2011 at 5:52 PM, Smiley, David W. dsmi...@mitre.org wrote: So https://issues.apache.org/jira/browse/SOLR-2405 didn't make it in yesterday (apparently it didn't)? :-( Darn... maybe I shouldn't have waited for a committer to agree with the issue. I would have had it in Saturday. ~ David Smiley On Mar 7, 2011, at 1:32 AM, Robert Muir wrote: Hi all, I have posted a release candidate for both Lucene 3.1 and Solr 3.1, both from revision 1078688 of http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/ Thanks for all your help! Please test them and give your votes; the tentative release date for both versions is Sunday, March 13th, 2011. Only votes from the Lucene PMC are binding, but everyone is welcome to check the release candidates and voice their approval or disapproval. The vote passes if at least three binding +1 votes are cast. The release candidates are produced in parallel because in 2010 we merged the development of Lucene and Solr in order to produce higher quality releases. While we voted to reserve the right to release Lucene by itself, in my opinion we should definitely try to avoid this unless absolutely necessary, as it would ultimately cause more work and complication: instead it would be far easier to just fix whatever issues are discovered and respin both releases again. Because of this, I ask that you cast a single vote to cover both releases.
If the vote succeeds, both sets of artifacts can go their separate ways to the different websites. Artifacts are located here: http://s.apache.org/solrcene31rc0
[jira] Commented: (SOLR-2381) The included jetty server does not support UTF-8
[ https://issues.apache.org/jira/browse/SOLR-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003991#comment-13003991 ] Bernd Fehling commented on SOLR-2381: - I first tested the patched version jetty-6.1.26-patched and still have the bug. Then I used jetty-7.3.0 and got the same bug. Then I debugged jetty-7.3.0, located the bug, and saw that it is fixed in jetty-7.3.1. And now I need the sources of the patched jetty-6.1.26 to see why there is still a bug and fix that one also. Or if you know where to look, then I'll leave it to you, no problem. Maybe you have contact with the Jetty developers and they want to fix this for jetty-6.1.26 and make a jetty-6.1.27 out of it?