[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599660#action_12599660 ]

Bojan Smid commented on SOLR-236:
---------------------------------

I will try to bring this patch up to date. Currently I see two main problems:

1) The patch applies to trunk, but it doesn't compile. The problem occurs mainly because of changes in Search Components (for instance, some method signatures that CollapseComponent implements were changed). I have this fixed locally (more or less), but I have to test it before posting a new version of the patch.

2) It seems that CollapseComponent can't be used in a chain with QueryComponent, only instead of it. CollapseComponent basically copies QueryComponent's querying logic and adds some of its own. I guess this isn't the right way to go. CollapseComponent should contain only collapsing logic and should be chainable with other components.

Can anyone confirm if I'm right here? Of course, there might be some fundamental reason why CollapseComponent had to be implemented this way.

Does anyone else see any other issues with this component?

Field collapsing
----------------

Key: SOLR-236
URL: https://issues.apache.org/jira/browse/SOLR-236
Project: Solr
Issue Type: New Feature
Components: search
Affects Versions: 1.3
Reporter: Emmanuel Keller
Assignee: Otis Gospodnetic
Attachments: field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch

This patch includes a new feature called field collapsing.

It is used to collapse a group of results with similar values for a given field into a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also duplicate detection.

http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 3 new query parameters (SolrParams):
collapse.field to choose the field used to group results
collapse.type normal (default value) or adjacent
collapse.max to select how many continuous results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

Two patches:
- field_collapsing.patch for current development version
- field_collapsing_1.1.0.patch for Solr-1.1.0

P.S.: Feedback and misspelling corrections are welcome ;-)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
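To make the difference between the two collapse.type values concrete, here is a minimal sketch in plain Java. It is not taken from any of the attached patches: the class name, the method names, and the reading of collapse.max as "keep at most this many results per group (normal) or per run of consecutive equal values (adjacent)" are assumptions based only on the parameter descriptions in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the two collapse types; not code from the SOLR-236 patch. */
final class CollapseSketch {

    /** "normal" (assumed semantics): keep at most max results per field value overall. */
    static List<Integer> collapseNormal(List<String> fieldValues, int max) {
        Map<String, Integer> seen = new HashMap<>();
        List<Integer> kept = new ArrayList<>();
        for (int pos = 0; pos < fieldValues.size(); pos++) {
            // Count how many results with this field value we have seen so far.
            int count = seen.merge(fieldValues.get(pos), 1, Integer::sum);
            if (count <= max) {
                kept.add(pos);
            }
        }
        return kept;
    }

    /** "adjacent" (assumed semantics): keep at most max results per run of consecutive equal values. */
    static List<Integer> collapseAdjacent(List<String> fieldValues, int max) {
        List<Integer> kept = new ArrayList<>();
        String previous = null;
        int run = 0;
        for (int pos = 0; pos < fieldValues.size(); pos++) {
            String value = fieldValues.get(pos);
            // Extend the current run, or start a new one when the value changes.
            run = value.equals(previous) ? run + 1 : 1;
            previous = value;
            if (run <= max) {
                kept.add(pos);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Field values of six results in rank order, e.g. the "site" field.
        List<String> sites = Arrays.asList("a", "a", "b", "a", "a", "a");
        System.out.println("normal:   " + collapseNormal(sites, 1));
        System.out.println("adjacent: " + collapseAdjacent(sites, 1));
    }
}
```

Under these assumptions, normal collapsing needs per-value state for the whole result list, while adjacent collapsing only needs to remember the previous value, which is why the two can return different positions for the same input.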
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599661#action_12599661 ]

Gunnar Wagenknecht commented on SOLR-236:
-----------------------------------------

Hi,

Thanks for your mail. Unfortunately, I won't be able to answer it soon. I'm on vacation until June 2nd without access to my mail.

-Gunnar

--
Gunnar Wagenknecht
[EMAIL PROTECTED]
http://wagenknecht.org/
Solr nightly build failure
init-forrest-entities:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build

compile-common:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build/common
    [javac] Compiling 34 source files to /tmp/apache-solr-nightly/build/common
    [javac] Note: /tmp/apache-solr-nightly/src/java/org/apache/solr/common/util/FastInputStream.java uses or overrides a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

compile:
    [mkdir] Created dir: /tmp/apache-solr-nightly/build/core
    [javac] Compiling 314 source files to /tmp/apache-solr-nightly/build/core
    [javac] /tmp/apache-solr-nightly/src/java/org/apache/solr/highlight/DefaultSolrHighlighter.java:45: cannot find symbol
    [javac] symbol  : class SpanScorer
    [javac] location: package org.apache.lucene.search.highlight
    [javac] import org.apache.lucene.search.highlight.SpanScorer;
    [javac] ^
    [javac] /tmp/apache-solr-nightly/src/java/org/apache/solr/highlight/DefaultSolrHighlighter.java:144: cannot find symbol
    [javac] symbol  : class SpanScorer
    [javac] location: class org.apache.solr.highlight.DefaultSolrHighlighter
    [javac] private SpanScorer getSpanQueryScorer(Query query, String fieldName, CachingTokenFilter tokenStream, SolrQueryRequest request) throws IOException {
    [javac] ^
    [javac] /tmp/apache-solr-nightly/src/java/org/apache/solr/highlight/DefaultSolrHighlighter.java:147: cannot find symbol
    [javac] symbol  : class SpanScorer
    [javac] location: class org.apache.solr.highlight.DefaultSolrHighlighter
    [javac] return new SpanScorer(query, fieldName, tokenStream);
    [javac] ^
    [javac] /tmp/apache-solr-nightly/src/java/org/apache/solr/highlight/DefaultSolrHighlighter.java:150: cannot find symbol
    [javac] symbol  : class SpanScorer
    [javac] location: class org.apache.solr.highlight.DefaultSolrHighlighter
    [javac] return new SpanScorer(query, null, tokenStream);
    [javac] ^
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 4 errors

BUILD FAILED
/tmp/apache-solr-nightly/build.xml:247: The following error occurred while executing this line:
/tmp/apache-solr-nightly/build.xml:124: Compile failed; see the compiler error output for details.

Total time: 11 seconds
Build failed in Hudson: Solr-trunk #451
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/451/changes -- [...truncated 1018 lines...] AUclient/ruby/flare/public/images/pie_93.png AUclient/ruby/flare/public/images/pie_57.png AUclient/ruby/flare/public/images/pie_94.png AUclient/ruby/flare/public/images/pie_58.png AUclient/ruby/flare/public/images/pie_95.png AUclient/ruby/flare/public/images/pie_59.png AUclient/ruby/flare/public/images/pie_96.png AUclient/ruby/flare/public/images/pie_97.png AUclient/ruby/flare/public/images/pie_0.png AUclient/ruby/flare/public/images/pie_98.png AUclient/ruby/flare/public/images/pie_1.png AUclient/ruby/flare/public/images/pie_99.png AUclient/ruby/flare/public/images/pie_2.png AUclient/ruby/flare/public/images/pie_3.png AUclient/ruby/flare/public/images/pie_4.png AUclient/ruby/flare/public/images/pie_5.png AUclient/ruby/flare/public/images/pie_6.png AUclient/ruby/flare/public/images/pie_7.png AUclient/ruby/flare/public/images/pie_8.png AUclient/ruby/flare/public/images/pie_9.png AUclient/ruby/flare/public/images/pie_20.png AUclient/ruby/flare/public/images/pie_21.png AUclient/ruby/flare/public/images/pie_22.png AUclient/ruby/flare/public/images/pie_23.png AUclient/ruby/flare/public/images/pie_60.png AUclient/ruby/flare/public/images/pie_24.png AUclient/ruby/flare/public/images/pie_61.png AUclient/ruby/flare/public/images/pie_25.png AUclient/ruby/flare/public/images/pie_62.png AUclient/ruby/flare/public/images/pie_26.png AUclient/ruby/flare/public/images/pie_63.png AUclient/ruby/flare/public/images/pie_27.png AUclient/ruby/flare/public/images/pie_64.png AUclient/ruby/flare/public/images/pie_28.png AUclient/ruby/flare/public/images/pie_29.png AUclient/ruby/flare/public/images/pie_65.png AUclient/ruby/flare/public/images/pie_66.png AUclient/ruby/flare/public/images/pie_67.png AUclient/ruby/flare/public/images/pie_68.png AUclient/ruby/flare/public/images/pie_69.png AUclient/ruby/flare/public/images/pie_30.png AUclient/ruby/flare/public/images/pie_31.png 
AUclient/ruby/flare/public/images/pie_32.png AUclient/ruby/flare/public/images/pie_33.png AUclient/ruby/flare/public/images/pie_34.png AUclient/ruby/flare/public/images/pie_70.png AUclient/ruby/flare/public/images/pie_35.png AUclient/ruby/flare/public/images/pie_71.png AUclient/ruby/flare/public/images/pie_36.png AUclient/ruby/flare/public/images/pie_72.png AUclient/ruby/flare/public/images/pie_37.png AUclient/ruby/flare/public/images/pie_73.png AUclient/ruby/flare/public/images/pie_38.png AUclient/ruby/flare/public/images/pie_74.png AUclient/ruby/flare/public/images/pie_39.png AUclient/ruby/flare/public/images/pie_75.png AUclient/ruby/flare/public/images/pie_76.png AUclient/ruby/flare/public/images/pie_77.png AUclient/ruby/flare/public/images/pie_78.png AUclient/ruby/flare/public/images/x-close.gif AUclient/ruby/flare/public/dispatch.fcgi A client/ruby/flare/public/robots.txt A client/ruby/flare/public/500.html A client/ruby/flare/public/javascripts A client/ruby/flare/public/javascripts/prototype.js A client/ruby/flare/public/javascripts/effects.js A client/ruby/flare/public/javascripts/dragdrop.js A client/ruby/flare/public/javascripts/application.js A client/ruby/flare/public/javascripts/controls.js A client/ruby/flare/public/404.html A client/ruby/flare/public/.htaccess A client/ruby/flare/public/stylesheets A client/ruby/flare/public/stylesheets/flare.css A client/ruby/flare/public/favicon.ico A client/ruby/solr-ruby A client/ruby/solr-ruby/solr A client/ruby/solr-ruby/solr/conf AUclient/ruby/solr-ruby/solr/conf/schema.xml A client/ruby/solr-ruby/solr/conf/protwords.txt A client/ruby/solr-ruby/solr/conf/stopwords.txt AUclient/ruby/solr-ruby/solr/conf/solrconfig.xml A client/ruby/solr-ruby/solr/conf/xslt A client/ruby/solr-ruby/solr/conf/xslt/example.xsl A client/ruby/solr-ruby/solr/conf/scripts.conf A client/ruby/solr-ruby/solr/conf/admin-extra.html A client/ruby/solr-ruby/solr/conf/synonyms.txt A client/ruby/solr-ruby/solr/lib A client/ruby/solr-ruby/test A 
client/ruby/solr-ruby/test/unit A client/ruby/solr-ruby/test/unit/standard_response_test.rb A client/ruby/solr-ruby/test/unit/document_test.rb AUclient/ruby/solr-ruby/test/unit/select_test.rb AU
Re: [jira] Resolved: (SOLR-505) Give RequestHandlers the possiblity to suppress the generation of HTTP caching headers
Hi Otis!

Otis Gospodnetic (JIRA) wrote:
> Resolution: Fixed
> Thanks, Thomas.

You are welcome... ;-) I did not expect such a fast response for a patch that had been lingering in JIRA for three months without a comment...

CU Thomas
[jira] Commented: (SOLR-553) Highlighter does not match phrase queries correctly
[ https://issues.apache.org/jira/browse/SOLR-553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599669#action_12599669 ]

Mark Miller commented on SOLR-553:
----------------------------------

Just to point out, as I am not sure it's clear: the SpanScorer is just as fast as the old Scorer when there are no Phrases or Spans in the query. Mark H actually tested it as slightly faster, though that's a bit odd. When there is a Span or Phrase, the non-Span/Phrase clauses of the Query are still highlighted the same way and at the same speed as with the original Scorer... it is just the Span/Phrase clauses that fire up a MemoryIndex and have getSpans called against it. So you really only pay for the extra position-sensitive part where it is actually needed.

Highlighter does not match phrase queries correctly
---------------------------------------------------

Key: SOLR-553
URL: https://issues.apache.org/jira/browse/SOLR-553
Project: Solr
Issue Type: New Feature
Components: highlighter
Affects Versions: 1.2
Environment: all
Reporter: Brian Whitman
Assignee: Otis Gospodnetic
Fix For: 1.3
Attachments: highlighttest.xml, SOLR-553-SC.patch, Solr-553.patch, Solr-553.patch, Solr-553.patch

http://www.nabble.com/highlighting-pt2%3A-returning-tokens-out-of-order-from-PhraseQuery-to16156718.html

Say we search for the band "I Love You But I've Chosen Darkness":

.../select?rows=100&q=%22I%20Love%20You%20But%20I\'ve%20Chosen%20Darkness%22&fq=type:html&hl=true&hl.fl=content&hl.fragsize=500&hl.snippets=5&hl.simple.pre=%3Cspan%3E&hl.simple.post=%3C/span%3E

The highlight returns a snippet that does have the name altogether:

Lights (Live) : <span>I</span> <span>Love</span> <span>You</span> But <span>I've</span> <span>Chosen</span> <span>Darkness</span> :

But it also returns unrelated snippets from the same page:

Black Francis Shop <span>I</span> Think <span>I</span> <span>Love</span> <span>You</span>

A correct highlighter should not return snippets that do not match the phrase exactly. LUCENE-794 (not yet committed, but seems to be ready) fixes up the problem from the Lucene end. Solr should get it too.

Related: SOLR-575
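The behaviour reported in this issue can be reproduced in a few lines of plain Java. The sketch below is only an illustration of term-level versus position-sensitive matching; it does not use Lucene, it is not how SpanScorer or MemoryIndex work internally, and the class and method names are hypothetical. Tokens are compared exactly, with no analysis or case folding.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of term-level vs. phrase-level highlighting; not Lucene code. */
final class PhraseHighlightSketch {

    /** Term-level: marks every token that occurs anywhere in the phrase, ignoring positions. */
    static String highlightTerms(String[] tokens, String[] phrase) {
        Set<String> terms = new HashSet<>(Arrays.asList(phrase));
        boolean[] hit = new boolean[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            hit[i] = terms.contains(tokens[i]);
        }
        return render(tokens, hit);
    }

    /** Phrase-level: marks tokens only where the whole phrase occurs contiguously and in order. */
    static String highlightPhrase(String[] tokens, String[] phrase) {
        boolean[] hit = new boolean[tokens.length];
        for (int start = 0; start + phrase.length <= tokens.length; start++) {
            String[] window = Arrays.copyOfRange(tokens, start, start + phrase.length);
            if (Arrays.equals(window, phrase)) {
                Arrays.fill(hit, start, start + phrase.length, true);
            }
        }
        return render(tokens, hit);
    }

    /** Joins the tokens with spaces, wrapping marked tokens in span tags. */
    private static String render(String[] tokens, boolean[] hit) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(hit[i] ? "<span>" + tokens[i] + "</span>" : tokens[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] phrase = "I Love You But I've Chosen Darkness".split(" ");
        String[] unrelated = "Black Francis Shop I Think I Love You".split(" ");
        System.out.println(highlightTerms(unrelated, phrase));
        System.out.println(highlightPhrase(unrelated, phrase));
    }
}
```

With the unrelated snippet from the description, the term-level method marks "I", "Love" and "You" individually, while the phrase-level method marks nothing, which is the result the reporter expects.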
[jira] Updated: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic updated SOLR-236:
---------------------------------

Comment: was deleted
[jira] Commented: (SOLR-569) SimpleFacet binarysearch optimization
[ https://issues.apache.org/jira/browse/SOLR-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599698#action_12599698 ]

Jason Rutherglen commented on SOLR-569:
---------------------------------------

Can use the Apache JDK6 version.

SimpleFacet binarysearch optimization
-------------------------------------

Key: SOLR-569
URL: https://issues.apache.org/jira/browse/SOLR-569
Project: Solr
Issue Type: Improvement
Components: search
Affects Versions: 1.3
Reporter: Jason Rutherglen
Priority: Minor

It looks like SimpleFacets.getFieldCacheCounts could have a small optimization:

{noformat}
startTermIndex = Arrays.binarySearch(terms,prefix,nullStrComparator);
if (startTermIndex<0) startTermIndex=-startTermIndex-1;
// find the end term. \uffff isn't a legal unicode char, but only compareTo
// is used, so it should be fine, and is guaranteed to be bigger than legal chars.
endTermIndex = Arrays.binarySearch(terms,prefix+"\uffff\uffff\uffff\uffff",nullStrComparator);
endTermIndex = -endTermIndex-1;
{noformat}

to:

{noformat}
startTermIndex = Arrays.binarySearch(terms,prefix,nullStrComparator);
if (startTermIndex<0) startTermIndex=-startTermIndex-1;
// find the end term. \uffff isn't a legal unicode char, but only compareTo
// is used, so it should be fine, and is guaranteed to be bigger than legal chars.
endTermIndex = Arrays.binarySearch(terms, startTermIndex, terms.length, prefix+"\uffff\uffff\uffff\uffff",nullStrComparator);
endTermIndex = -endTermIndex-1;
{noformat}
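As a standalone illustration of the proposed change, the sketch below finds the range of sorted terms that share a prefix, restricting the second binary search to the part of the array at or after the start index. It relies on the ranged Arrays.binarySearch overload added in Java 6. The class and method names are hypothetical, natural String ordering stands in for Solr's nullStrComparator, and the code assumes the sentinel string never occurs as a term.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical standalone version of the prefix range lookup; not the Solr source. */
final class PrefixRangeSketch {

    /**
     * Returns {start, end} such that terms[start..end) are exactly the entries
     * beginning with prefix. The terms array must be sorted in natural order.
     */
    static int[] prefixRange(String[] terms, String prefix) {
        Comparator<String> cmp = Comparator.naturalOrder();

        int start = Arrays.binarySearch(terms, prefix, cmp);
        if (start < 0) {
            start = -start - 1; // not found: convert to the insertion point
        }

        // The sentinel sorts after every term with this prefix because String
        // comparison is by UTF-16 code unit and \uffff is the largest value.
        String sentinel = prefix + "\uffff\uffff\uffff\uffff";

        // Only search from start onwards: everything before it sorts below the prefix.
        int end = Arrays.binarySearch(terms, start, terms.length, sentinel, cmp);
        end = -end - 1; // assumes the sentinel itself is never a term

        return new int[] {start, end};
    }

    public static void main(String[] args) {
        String[] terms = {"apple", "banana", "band", "bandana", "cat"};
        System.out.println(Arrays.toString(prefixRange(terms, "ban")));
    }
}
```

The saving is modest: the second search starts from a smaller range, so it can skip a few comparisons when the prefix falls late in the array.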
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599714#action_12599714 ]

Bojan Smid commented on SOLR-236:
---------------------------------

Hi Oleg. I'll look into this also. In case you have any working code, you can mail it to me, and I'll see what can be reused.
[jira] Issue Comment Edited: (SOLR-561) Solr replication by Solr (for windows also)
[ https://issues.apache.org/jira/browse/SOLR-561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599559#action_12599559 ]

noble.paul edited comment on SOLR-561 at 5/25/08 9:48 PM:
----------------------------------------------------------

We shall post a patch in the next few days.

The design is as follows:

* SnapShooter.java: registered as a listener on _postCommit/postOptimize_. It makes a copy of the latest index to a new snapshot folder (same as it is today). Only in master. It can optionally take a 'snapDir' as configuration if the snapshot is to be created in a folder other than the data directory.
* ReplicationHandler: a requesthandler. This is registered in master & slave. It takes the following config in the slave. The master node just needs an empty requesthandler registration.

{code:xml|title=solrconfig.xml}
<requestHandler name="replication" class="solr.ReplicationHandler">
  <str name="masterUrl">http://host:port/solr/corename/replication</str>
  <str name="pollInterVal">HH:MM:SS</str>
</requestHandler>
{code}

ReplicationHandler implements the following methods. Every method is invoked over *http GET*. These methods are usually triggered from the slave (over http) or from a timer (for snappull). Admin can provide means to invoke some methods like snappull and snapshoot.

* CMD_GET_FILE: _(command=filecontent&snapshot=snapshotname&file=filename&offset=fileOffset&len=length-of-chunk&checksum=true|false)_. This is invoked by a slave only, to fetch a file or a part of it. This uses a custom format (described later).
* CMD_LATEST_SNAP: _(command=latestsnap)_. Returns the name of the latest snapshot (a namedlist response).
* CMD_GET_SNAPSHOTS: _(command=snaplist)_. Returns a list of all snapshot names (a namedlist response).
* CMD_GET_FILE_LIST: _(command=filelist&snap=snapshotname)_. A list of all the files in the snapshot. Contains name, lastmodified, size (a namedlist response).
* CMD_SNAP_SHOOT: _(command=snapshoot)_. Forces a snapshoot.
* CMD_DISABLE_SNAPPOLL: _(command=disablesnappoll)_. For stopping the timer task.
* CMD_SNAP_PULL: _(command=snappull)_. Does the following operations (done in the slave). It is mostly triggered from a timer task based on the pollInterval value.
** calls CMD_LATEST_SNAP on the master and gets the latest snapshot name
** checks whether it already has the same snapshot (or whether a snappull is in progress)
** if the snapshot is to be pulled, calls CMD_GET_FILE_LIST on the master
** for each file in the list, makes a CMD_GET_FILE call to the master. This command works in the following way:
*** The server reads the file stream.
*** It uses a CustomStreamResponseWriter _(wt=filestream)_ to write the content. It has a packetSize (say 1 MB).
*** It writes an int for the length and a long for the Adler32 checksum (if checksum=true). The packets are written one after another until EOF or an Exception.
*** SnapPuller.java in the client reads the packet length and checksum and tries to read the packet. If it is unable to read the given packet, or the checksum does not match, or there is an exception, it closes the connection and issues a new CMD_GET_FILE command with offset = (totalbytesReceived). If everything is fine, the packets are read until __bytesDownloaded == fileSize__.
*** This is continued until all the files are downloaded.
** creates a folder index.tmp
** for each file in the copied snapshot, tries to create a hardlink in the index.tmp folder (runs an OS-specific command)
** if hardlink creation fails, uses a copy
** renames _index.tmp_ to _index_
** calls a commit on the updatehandler

*Note:* The download tries to use the same stream to download the complete file.

Please comment on the design.

was (Author: noble.paul):
We shall post a patch in the next few days.

The design is as follows:

* SnapShooter.java: registered as a listener on _postCommit/postOptimize_. It makes a copy of the latest index to a new snapshot folder (same as it is today). Only in master.
* ReplicationHandler: a requesthandler. This is registered in master & slave. It takes the following config in the slave:

{code:xml}
<str name="masterUrl">http://host:port/solr/corename/replication</str>
<str name="pollInterVal">HH:MM:SS</str>
{code}

Implements the following methods:

* CMD_GET_FILE: _(command=filecontent&snapshot=snapshotname&file=filename&offset=fileOffset&len=length-of-chunk&checksum=true|false)_ Would
* CMD_LATEST_SNAP: _(command=latestsnap)_. Returns the name of the latest snapshot (a namedlist response).
* CMD_GET_SNAPSHOTS: _(command=snaps)_. Returns a list of all snapshot names (a namedlist response).
* CMD_GET_FILE_LIST: _(command=filelist&snap=snapshotname)_. A list of all the files in the snapshot. Contains name, lastmodified, size (a namedlist response).
* CMD_SNAP_SHOOT: _(command=snapshoot)_. Forces a snapshoot.
* CMD_DISABLE_SNAPPOLL: _(command=disablesnappoll)_. For stopping the timer task.
* CMD_SNAP_PULL: _(command=snappull)_. Does the following operations
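The packet format described for CMD_GET_FILE in the edited comment (an int length, a long Adler32 checksum, then the payload, repeated until the end of the file) can be sketched as follows. This is only an in-memory illustration under stated assumptions: the class and method names are hypothetical, the checksum is always written, the byte order is ByteBuffer's default big-endian, and there is no HTTP transport or retry-from-offset logic. It is not the code of the promised patch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.Adler32;

/** Hypothetical in-memory sketch of the length + Adler32 + payload framing; not the SOLR-561 patch. */
final class PacketFramingSketch {

    private static final int HEADER_BYTES = Integer.BYTES + Long.BYTES; // int length + long checksum

    /** Splits data into packets of at most packetSize bytes and frames each one. */
    static byte[] writePackets(byte[] data, int packetSize) {
        if (packetSize <= 0) {
            throw new IllegalArgumentException("packetSize must be positive");
        }
        int packets = (data.length + packetSize - 1) / packetSize;
        ByteBuffer out = ByteBuffer.allocate(data.length + packets * HEADER_BYTES);
        for (int off = 0; off < data.length; off += packetSize) {
            int len = Math.min(packetSize, data.length - off);
            Adler32 adler = new Adler32();
            adler.update(data, off, len);
            out.putInt(len);              // packet length
            out.putLong(adler.getValue()); // Adler32 checksum of the payload
            out.put(data, off, len);      // payload
        }
        return out.array();
    }

    /**
     * Reads framed packets back and verifies each checksum.
     * Throws IllegalStateException on a mismatch; a real client would instead
     * reconnect and request the file again from the number of bytes received so far.
     */
    static byte[] readPackets(byte[] framed) {
        ByteBuffer in = ByteBuffer.wrap(framed);
        ByteBuffer received = ByteBuffer.allocate(framed.length);
        while (in.hasRemaining()) {
            int len = in.getInt();
            long expected = in.getLong();
            byte[] packet = new byte[len];
            in.get(packet);
            Adler32 adler = new Adler32();
            adler.update(packet, 0, len);
            if (adler.getValue() != expected) {
                throw new IllegalStateException("checksum mismatch after " + received.position() + " bytes");
            }
            received.put(packet);
        }
        byte[] result = new byte[received.position()];
        received.flip();
        received.get(result);
        return result;
    }

    public static void main(String[] args) {
        byte[] data = "segments_1 contents".getBytes(StandardCharsets.UTF_8);
        byte[] framed = writePackets(data, 8);
        System.out.println(new String(readPackets(framed), StandardCharsets.UTF_8));
    }
}
```

Checking each packet separately means a corrupted transfer is detected early, so the receiver knows how many verified bytes it holds and can resume from that offset rather than restarting the whole file.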