[jira] Updated: (SOLR-1737) Add a FieldStreamDataSource
[ https://issues.apache.org/jira/browse/SOLR-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-1737:
    Attachment: SOLR-1737.patch

Add a FieldStreamDataSource
    Key: SOLR-1737
    URL: https://issues.apache.org/jira/browse/SOLR-1737
    Project: Solr
    Issue Type: New Feature
    Components: contrib - DataImportHandler
    Reporter: Noble Paul
    Assignee: Noble Paul
    Priority: Minor
    Fix For: 1.5
    Attachments: SOLR-1737.patch

TikaEntityProcessor needs a DataSource which returns a Stream instead of a Reader.

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: how to sort facets?
Hi,

thanks.

So long,
David Rühr

Koji Sekiguchi wrote:

    David Rühr wrote:

        Hi, we are building a filter with the faceting feature. In our facet list the entries are ordered by match count (facet.sort=count), but we need them sorted by manufacturer (facet.sort=manufacturer). Manipulating the URL doesn't change anything. Why?

        select?fl=*%2Cscore&fq=type%3Apage&spellcheck=true&facet=true&facet.mincount=1&facet.sort=manufacturer&bf=log(supplier_faktor)&facet.field=supplier&facet.field=manufacturer&version=1.2&q=kind&start=0&rows=10

        so long, David

    Try facet.sort=index. facet.sort accepts only count or index.
    http://wiki.apache.org/solr/SimpleFacetParameters#facet.sort

    Koji

Kind regards,
David Rühr
PHP Programmer
--
Marketing Factory Consulting GmbH * mailto:d...@marketing-factory.de
Stephanienstraße 36 * Tel.: +49 211-361176-58
D-40211 Düsseldorf, Germany * Fax: +49 211-361176-99
Düsseldorf District Court (Amtsgericht Düsseldorf) HRB 53971 * http://www.marketing-factory.de/
Managing directors: Peter Faisst | Katja Faisst | Karoline Steinfatt | Christoph Allefeld | Markus M. Kimmel
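To make the difference between the two accepted values concrete, here is a minimal Java sketch of the two orderings. It is only an illustration, not Solr code: the class name, the manufacturer names and the counts are made up, and breaking count ties by term order is an assumption. For plain ASCII values, String.compareTo matches the order of the indexed terms.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrates the two orderings that facet.sort selects between (hypothetical helper, not Solr code). */
final class FacetSortSketch {

    /** Returns the facet values in the order facet.sort=count or facet.sort=index would list them. */
    static List<String> order(Map<String, Integer> counts, String facetSort) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        if ("count".equals(facetSort)) {
            // Highest count first; ties fall back to term order (assumed tie-break).
            entries.sort((a, b) -> {
                int byCount = Integer.compare(b.getValue(), a.getValue());
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            });
        } else if ("index".equals(facetSort)) {
            // Order by the facet value itself, ignoring the counts.
            entries.sort((a, b) -> a.getKey().compareTo(b.getKey()));
        } else {
            throw new IllegalArgumentException("facet.sort accepts only count or index: " + facetSort);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries) {
            result.add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> manufacturers = new HashMap<>();
        manufacturers.put("Siemens", 7);
        manufacturers.put("AEG", 3);
        manufacturers.put("Bosch", 12);
        System.out.println(order(manufacturers, "count")); // [Bosch, Siemens, AEG]
        System.out.println(order(manufacturers, "index")); // [AEG, Bosch, Siemens]
    }
}
```

A field name such as manufacturer is rejected by this sketch in the same spirit as the reply above: the parameter chooses an ordering, it does not name a field to sort by.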
Hudson build is back to normal: Solr-trunk #1044
See http://hudson.zones.apache.org/hudson/job/Solr-trunk/1044/changes
Using MoreLikeThisHandler
Hi,

I am trying to work with the MoreLikeThisHandler in order to get similar documents. Here is my configuration in schema.xml:

<fields>
  <field name="id" type="sint" indexed="true" stored="true" required="true" termVectors="true"/>
  <field name="title" type="text" indexed="true" stored="false" termVectors="true"/>
  <field name="keywordGroup" type="string" indexed="true" stored="false" multiValued="true" termVectors="true"/>
  <field name="tagText" type="text" indexed="true" stored="true" multiValued="true" default="" termVectors="true"/>
</fields>

Configuration in solrconfig.xml:

<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.fl">title,tagText,keywordGroup</str>
    <str name="mlt.qf">title^1.5 tagText keywordGroup^0.5</str>
    <str name="mlt.mintf">1</str>
    <str name="mlt.mindf">1</str>
    <str name="mlt.boost">true</str>
    <str name="mlt.match.include">true</str>
  </lst>
</requestHandler>

And I fire the query like this:

http://10.99.82.12:8080/Dev/mlt/?q=id:7735&mlt.mindf=1&mlt.mintf=1&mlt.boost=true&mlt.match.include=true&mlt.fl=title,tagText,keywordGroup
http://10.99.82.12:8080/Dev/mlt/?q=id:7735&mlt.mindf=1&mlt.mintf=1&mlt.boost=true&mlt.match.include=true&mlt.fl=title,tagText
http://localhost:8983/solr/mlt?q=id:100

I do get some results, but they are not accurate. Now I have a couple of questions:

1. Is this configuration correct for getting similar documents?
2. Is it possible to support a different boost for each keywordGroup? If so, please give me a hint on how I can achieve this.

Thanks,
Nayan K
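Since request URLs are easy to get wrong when the parameter separators or the escaping go missing, here is a small Java sketch that assembles the query string for a request like the one above. The class and method names are hypothetical; only the parameter names and values come from the message, and the host is the localhost example quoted there.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Builds an '&'-separated, URL-encoded query string (hypothetical helper, not part of Solr). */
final class MltUrlSketch {

    static String buildQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&'); // every parameter after the first needs a separator
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("q", "id:7735");
        params.put("mlt.fl", "title,tagText,keywordGroup");
        params.put("mlt.mindf", "1");
        params.put("mlt.mintf", "1");
        params.put("mlt.boost", "true");
        params.put("mlt.match.include", "true");
        System.out.println("http://localhost:8983/solr/mlt?" + buildQuery(params));
    }
}
```

The sketch only addresses how the request is put together; it does not answer the two questions about result quality and per-keywordGroup boosts.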
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805500#action_12805500 ]

Jan Høydahl commented on SOLR-1725:

It looks logical and nice. However, I'm leaning towards keeping it very simple. The simplest is one script per processor, since that will always work.

As more and more update processors are written, in Java, JS, Jython and more, it would be a clear benefit if administrators don't need to care about the underlying implementation, but can use the same way of configuring each one. That's why I opt for the top-level param structure as default.

I have years of experience with FAST document processing, which is really a killer feature, mainly because it's so dead simple. Drop in a Python script with a deployment descriptor and start using it in your pipelines. You don't care if the implementation is pure Python, a C library wrapper or whatever, you just care about what parameters to give it. I see this patch as one big step towards the same simplicity with Solr!

Script based UpdateRequestProcessorFactory
    Key: SOLR-1725
    URL: https://issues.apache.org/jira/browse/SOLR-1725
    Project: Solr
    Issue Type: New Feature
    Components: update
    Affects Versions: 1.4
    Reporter: Uri Boness
    Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch

A script based UpdateRequestProcessorFactory (uses JDK6 script engine support). The main goal of this plugin is to be able to configure/write update processors without the need to write and package Java code.

The update request processor factory enables writing update processors in scripts located in the {{solr.solr.home}} directory. The factory accepts one (mandatory) configuration parameter named {{scripts}} which accepts a comma-separated list of file names. It will look for these files under the {{conf}} directory in solr home. When multiple scripts are defined, their execution order is defined by the lexicographical order of the script file names (so {{scriptA.js}} will be executed before {{scriptB.js}}).

The script language is resolved based on the script file extension (that is, a *.js file will be treated as a JavaScript script), therefore an extension is mandatory.

Each script file is expected to have one or more methods with the same signature as the methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods, only those that are required by the processing logic.

The following variables are defined as global variables for each script:
* {{req}} - The SolrQueryRequest
* {{rsp}} - The SolrQueryResponse
* {{logger}} - A logger that can be used for logging purposes in the script
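The ordering and language-resolution rules in the description can be sketched in a few lines of Java. This is not the code from the patch; the class and method names are hypothetical, and the sketch only derives the execution order and the file extension. In a real factory the extension could then be handed to the JDK's javax.script.ScriptEngineManager.getEngineByExtension to look up an engine.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch of the rules described for the scripts parameter (hypothetical helper, not the patch code). */
final class ScriptListSketch {

    /** Splits the comma-separated scripts value and returns the names in lexicographical order. */
    static List<String> executionOrder(String scriptsParam) {
        List<String> names = new ArrayList<>();
        for (String raw : scriptsParam.split(",")) {
            String name = raw.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        Collections.sort(names); // scriptA.js runs before scriptB.js
        return names;
    }

    /** Returns the file extension that selects the script language; an extension is mandatory. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("Script file needs an extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        for (String name : executionOrder("scriptB.js, scriptA.js")) {
            System.out.println(name + " -> " + extensionOf(name));
        }
    }
}
```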
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805511#action_12805511 ]

Grant Ingersoll commented on SOLR-1163:

Uri, is this patch still up to date? Is it a contrib?

Solr Explorer - A generic GWT client for Solr
    Key: SOLR-1163
    URL: https://issues.apache.org/jira/browse/SOLR-1163
    Project: Solr
    Issue Type: New Feature
    Components: web gui
    Affects Versions: 1.3
    Reporter: Uri Boness
    Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch

The attached patch is a generic GWT client for Solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed Solr. It is configured with its own configuration file, where one can configure the Solr instance/core to connect to. Since it's currently standalone and completely client-side based, it uses JSON with padding (cross-site scripting) to connect to remote Solr servers.

Some of the supported features:
- Simple query search
- Sorting - one can dynamically define new sort criteria
- Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit.
- Custom hit rendering - It is possible to show thumbnails (images) per hit and also customize a view for a hit based on HTML templates
- Faceting - one can dynamically define field and query facets via the UI. It is also possible to pre-configure these facets in the configuration file.
- Highlighting - you can dynamically configure highlighting. It can also be pre-configured in the configuration file.
- Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands.
- Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add)
- Console - For development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to:
** view the client logs
** browse the Solr schema
** view a breakdown of the current search context
** view a breakdown of the query URL that is sent to Solr
** view the raw JSON response returning from Solr

This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more...

To give a better view of what's currently possible, we've set up a public version of this client at: http://search.jteam.nl/explorer. This client is configured with one Solr instance where crawled YouTube movies were indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help

The patch creates a new folder in the contrib directory. Since the patch doesn't contain binaries, an additional zip file is provided that needs to be extracted to add all the required graphics. This module is maven2 based and is configured in such a way that all GWT related tools/libraries are automatically downloaded when the module is compiled. One of the artifacts of the build is a war file which can be deployed in any servlet container.

NOTE: this client works best on WebKit based browsers (for performance reasons) but also works on Firefox and IE 7+. That said, it should be taken into account that it is still under development.
Re: CHANGES.txt updates for SOLR-1516 and SOLR-1592
Thanks, Hoss, no problemo, appreciate it!

On 1/26/10 12:22 PM, Chris Hostetter hossman_luc...@fucit.org wrote:

: Not to be a pest, but there's no CHANGES.txt updates for SOLR-1516 and
: SOLR-1592. Could someone update them? A trivial patch is attached...

Sorry about that.

Every change (with the possible exception of fixing formatting or documentation typos) *should* have a CHANGES.txt entry. Every change that affects the public API *MUST* have a CHANGES.txt entry.

Committed revision 903398.

-Hoss

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805532#action_12805532 ]

Yonik Seeley commented on SOLR-1725:

Cool feature!

Performance:
- It looks like scripts are read from the resource loader and parsed again (eval) for every update request. This can be pretty expensive, esp. for those scripting languages that generate Java class files instead of using an interpreter. One way to combat this would be to cache and reuse them.

Interface:
- Should we have a way to specify a script in-line (in solrconfig.xml)?
- Or even cooler... allow passing of scripts as parameters in the update request! Think about the power of pointing Solr to a CSV file and also providing document transformers and field manipulators on the fly!
- This seems to raise the visibility of the UpdateCommand classes, directly exposing them to users w/o plugins. We should perhaps consider interface cleanups on these classes at the same time as this issue.
- Examples! Using JavaScript (since it's both fast and included in JDK6), let's see what the scripts are for some common use cases. This both helps improve the design and lets other people give feedback w/o having to read through code.
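The "cache and reuse" idea from the performance comment can be sketched as follows. This is an assumption about one possible shape, not the patch's implementation: the class name is hypothetical and the compile step is passed in as a plain function so the sketch runs without any script engine. With an engine that implements the JDK's javax.script.Compilable, that function could return a CompiledScript instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Caches the result of an expensive per-script compile step (hypothetical sketch). */
final class ScriptCacheSketch<C> {

    private final ConcurrentMap<String, C> cache = new ConcurrentHashMap<>();
    private final Function<String, C> compiler;

    /** compiler turns a script name into its compiled form; here it is a stand-in for a real engine. */
    ScriptCacheSketch(Function<String, C> compiler) {
        this.compiler = compiler;
    }

    /** Compiles the script on first use and reuses the cached result afterwards. */
    C get(String scriptName) {
        return cache.computeIfAbsent(scriptName, compiler);
    }

    public static void main(String[] args) {
        AtomicInteger compilations = new AtomicInteger();
        ScriptCacheSketch<String> cache = new ScriptCacheSketch<>(name -> {
            compilations.incrementAndGet(); // counts how often the expensive step runs
            return "compiled:" + name;
        });
        for (int request = 0; request < 3; request++) {
            cache.get("scriptA.js"); // three update requests, one script
        }
        System.out.println("compilations: " + compilations.get()); // compilations: 1
    }
}
```

The sketch does not address invalidation when a script file changes, nor how request-scoped variables such as req and rsp would be supplied to a cached script.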
[jira] Created: (SOLR-1738) Upgrade to Tika 0.6
Upgrade to Tika 0.6
    Key: SOLR-1738
    URL: https://issues.apache.org/jira/browse/SOLR-1738
    Project: Solr
    Issue Type: Improvement
    Reporter: Grant Ingersoll
    Assignee: Grant Ingersoll
    Priority: Minor
    Fix For: 1.5

See title.
[jira] Commented: (SOLR-1728) ResponseWriters should support byte[], ByteBuffer
[ https://issues.apache.org/jira/browse/SOLR-1728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805578#action_12805578 ]

Yonik Seeley commented on SOLR-1728:

Seems to make sense from a completeness point of view. It also allows a closer semantic mapping (i.e. we could use the closest equivalent to byte arrays for Python and Ruby).

ResponseWriters should support byte[], ByteBuffer
    Key: SOLR-1728
    URL: https://issues.apache.org/jira/browse/SOLR-1728
    Project: Solr
    Issue Type: Improvement
    Reporter: Noble Paul
    Assignee: Noble Paul
    Priority: Minor
    Fix For: 1.5

Only BinaryResponseWriter supports byte[] and ByteBuffer. Other writers should also support these.
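As an illustration of what supporting byte[] and ByteBuffer in a text-based writer could involve, here is a minimal Java sketch that turns both into a string. Base64 is an assumption chosen for the example; the issue does not say which encoding the writers should use, and the class and method names are hypothetical. It relies on java.util.Base64, which requires Java 8 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Renders binary field values as text (hypothetical sketch; the Base64 choice is an assumption). */
final class BinaryValueSketch {

    static String toText(byte[] value) {
        return Base64.getEncoder().encodeToString(value);
    }

    static String toText(ByteBuffer value) {
        ByteBuffer copy = value.duplicate(); // leave the caller's position and limit untouched
        byte[] bytes = new byte[copy.remaining()];
        copy.get(bytes);
        return toText(bytes);
    }

    public static void main(String[] args) {
        byte[] raw = "Solr".getBytes(StandardCharsets.UTF_8);
        System.out.println(toText(raw));                  // U29scg==
        System.out.println(toText(ByteBuffer.wrap(raw))); // U29scg==
    }
}
```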
Re: configure FastVectorHighlighter in trunk
I am having some trouble making it work. I am debugging the code and I see that when the FastVectorHighlighter is constructed, the parameters it receives are OK:

    // get FastVectorHighlighter instance out of the processing loop
    FastVectorHighlighter fvh = new FastVectorHighlighter(
        // FVH cannot process hl.usePhraseHighlighter parameter per-field basis
        params.getBool( HighlightParams.USE_PHRASE_HIGHLIGHTER, true ),
        // FVH cannot process hl.requireFieldMatch parameter per-field basis
        params.getBool( HighlightParams.FIELD_MATCH, false ),
        getFragListBuilder( params ),
        getFragmentsBuilder( params ) );

The query here is OK as well:

    FieldQuery fieldQuery = fvh.getFieldQuery( query );

But I can't see what's in fieldQuery (the debugger shows just a memory address and I don't know how to do something similar to toString()).

The problem I see is in:

    String[] snippets = highlighter.getBestFragments( fieldQuery, req.getSearcher().getReader(), docId, fieldName,
        params.getFieldInt( fieldName, HighlightParams.FRAGSIZE, 100 ),
        params.getFieldInt( fieldName, HighlightParams.SNIPPETS, 1 ) );

snippets ends up as an empty array, so it jumps to:

    alternateField( docSummaries, params, doc, fieldName );

In solrconfig.xml I added:

    <fragListBuilder name="simple" class="org.apache.solr.highlight.SimpleFragListBuilder" default="false"/>
    <fragmentsBuilder name="colored" class="org.apache.solr.highlight.MultiColoredScoreOrderFragmentsBuilder" default="false"/>

Maybe I am missing something... any idea? Highlighting via doHighlightingByHighlighter works perfectly.

**I have also noticed that setting the snippet fragment size to 0 (which in normal highlighting returns the whole field highlighted) gives an error.

Koji Sekiguchi-2 wrote:

    Marc Sturlese wrote:

        How do I activate FastVectorHighlighter in trunk? Which of those params sets it up?

        <!-- Configure the standard fragListBuilder -->
        <fragListBuilder name="simple" class="org.apache.solr.highlight.SimpleFragListBuilder" default="true"/>
        <!-- Configure the standard fragmentsBuilder -->
        <fragmentsBuilder name="colored" class="org.apache.solr.highlight.MultiColoredScoreOrderFragmentsBuilder" default="true"/>
        <fragmentsBuilder name="scoreOrder" class="org.apache.solr.highlight.ScoreOrderFragmentsBuilder" default="true"/>

        Thanks in advance.

    You do not need to activate it. DefaultSolrHighlighter, which is the default SolrHighlighter impl, automatically uses FVH when the hl.fl parameter names fields for which termVectors, termPositions and termOffsets are all true. If you want to use the multi-colored tag feature, you need to specify MultiColored*FragmentsBuilder in solrconfig.xml.

    Koji
    -- http://www.rondhuit.com/en/

-- View this message in context: http://old.nabble.com/configure-FastVectorHihglighter-in-trunk-tp27319976p27344139.html
Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
[ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805601#action_12805601 ]

Uri Boness commented on SOLR-1163:

Actually I've been working on a new version for the explorer which I plan to put soon as a patch here.
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805612#action_12805612 ]

Uri Boness commented on SOLR-1725:

{quote}
Performance: It looks like scripts are read from the resource loader and parsed again (eval) for every update request. This can be pretty expensive, esp for those scripting languages that generate java class files instead of using an interpreter. One way to combat this would be to cache and reuse them.
{quote}

Yes, indeed the scripts are evaluated per request, but for a reason. One of the goals here is to keep the scripts as close as possible to the update processor interface, so the functions in the scripts have the same signature as the methods in the processor. But in order for the scripts to be flexible I decided to introduce some globally scoped variables which are accessible in the functions (currently the current Solr request, the response and a logger are there). The problem is that the API only defines 3 scopes where you can register variables, and the lowest one is the engine itself. Since the evaluation of a script is done on the engine level as well, when using this API together with the global variables I don't think you can escape the need for creating an engine per request (thus, also evaluating the scripts). But I agree with you that if there is a way around it, caching the evaluated/compiled scripts will definitely boost things up. I'll need to investigate this further and come up with alternatives (I already have some ideas using ThreadLocals).

bq. Should we have a way to specify a script in-line (in solrconfig.xml)?

Personally I prefer keeping the solrconfig.xml as clean as possible. I do however think that a standardization of Solr scripting support in general can be great (for example, have a scripts folder under _solr.solr.home_ where all the scripts are placed, or come up with a standard configuration structure for the scripts... perhaps something in the direction Hoss suggested above).

bq. This seems to raise the visibility of the UpdateCommand classes, directly exposing them to users w/o plugins. We should perhaps consider interface cleanups on these classes at the same time as this issue.

+1

bq. Examples! Using javascript (since it's both fast and included in JDK6), let's see what the scripts are for some common usecases. This both helps improve the design as well as lets other people give feedback w/o having to read through code.

Yep.. that would probably be very helpful. Basically I think anyone who's ever written an update processor can perhaps try to convert it to a script and see how it works. The usual use case for me is to just add a few fields which are derived from the other fields, but perhaps there are some other more interesting use cases out there. I guess these examples should be put in the Wiki, right?
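The "add a few fields derived from other fields" use case mentioned in the comment can be sketched like this. It is a simplified stand-in rather than Solr's API: a plain Map plays the role of the input document, the method name only mirrors the processAdd hook of an update processor, and the field names title and title_length are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a derived-field step; a Map stands in for the document (hypothetical, not Solr's API). */
final class DerivedFieldSketch {

    /** Adds a "title_length" field computed from "title" when the document has a title. */
    static void processAdd(Map<String, Object> doc) {
        Object title = doc.get("title");
        if (title != null) {
            doc.put("title_length", title.toString().length());
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "7735");
        doc.put("title", "Solr");
        processAdd(doc);
        System.out.println(doc); // {id=7735, title=Solr, title_length=4}
    }
}
```

A script version would carry the same logic in a function with the processor's signature, with req, rsp and logger available as globals.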
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805672#action_12805672 ]

Uri Boness commented on SOLR-1725:

Been looking more into it and I think there's a nice way in which we can cache the evaluated scripts. But... (and there's always a but) to make it work cleanly we need to be able to extend the scripting support, which means we need to be able to compile the code in Java 6. And this brings us back to Mark's comment above on how do we want to do that.
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805678#action_12805678 ]

Yonik Seeley commented on SOLR-1725:

As you pointed out, Java5 is EOL'd already and Sun/Oracle doesn't even let you download JDK5 anymore w/o registration. Wouldn't hurt my feelings to move to Java6. After all, the SolrCloud stuff we're working on uses zookeeper which requires 1.6.
[jira] Commented: (SOLR-1725) Script based UpdateRequestProcessorFactory
[ https://issues.apache.org/jira/browse/SOLR-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12805691#action_12805691 ] Uri Boness commented on SOLR-1725: -- Well then... I just hope others will not shed tears as well and we can make Solr 1.5 Java 6 compiled :-) Script based UpdateRequestProcessorFactory -- Key: SOLR-1725 URL: https://issues.apache.org/jira/browse/SOLR-1725 Project: Solr Issue Type: New Feature Components: update Affects Versions: 1.4 Reporter: Uri Boness Attachments: SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch, SOLR-1725.patch A script based UpdateRequestProcessorFactory (Uses JDK6 script engine support). The main goal of this plugin is to be able to configure/write update processors without the need to write and package Java code. The update request processor factory enables writing update processors in scripts located in {{solr.solr.home}} directory. The functory accepts one (mandatory) configuration parameter named {{scripts}} which accepts a comma-separated list of file names. It will look for these files under the {{conf}} directory in solr home. When multiple scripts are defined, their execution order is defined by the lexicographical order of the script file name (so {{scriptA.js}} will be executed before {{scriptB.js}}). The script language is resolved based on the script file extension (that is, a *.js files will be treated as a JavaScript script), therefore an extension is mandatory. Each script file is expected to have one or more methods with the same signature as the methods in the {{UpdateRequestProcessor}} interface. It is *not* required to define all methods, only those hat are required by the processing logic. 
The following variables are defined as global variables for each script: * {{req}} - The SolrQueryRequest * {{rsp}} - The SolrQueryResponse * {{logger}} - A logger that can be used for logging purposes in the script -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
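The ordering and language-resolution rules in the description above can be sketched in plain Java. This is only an illustration under stated assumptions, not code from SOLR-1725.patch: the class name ScriptOrderSketch and both helper methods are hypothetical, and trimming whitespace around the comma-separated names is an assumption the description does not confirm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the patch: models how the "scripts"
// parameter could be turned into an execution order.
class ScriptOrderSketch {

    // Splits the comma-separated "scripts" value and sorts the file names
    // lexicographically, the execution order given in the description.
    // Trimming whitespace around each name is an assumption.
    static List<String> orderScripts(String scriptsParam) {
        List<String> names = new ArrayList<String>();
        for (String name : scriptsParam.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        Collections.sort(names);
        return names;
    }

    // Returns the file extension that would select the script language.
    // An extension is mandatory, so a name without one is rejected.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException(
                "Script file needs an extension: " + fileName);
        }
        return fileName.substring(dot + 1);
    }

    public static void main(String[] args) {
        // Prints each script with its extension, in execution order.
        for (String script : orderScripts("scriptB.js, scriptA.js")) {
            System.out.println(script + " -> " + extensionOf(script));
        }
    }
}
```

With the JDK 6 script engine support the issue mentions, an extension such as "js" is the kind of value that javax.script.ScriptEngineManager.getEngineByExtension accepts; whether the patch resolves engines that way is not stated in this message.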
Re: configure FastVectorHighlighter in trunk
Can you give me the following info to reproduce the problem? * field data * query string * field definition in schema.xml **I also have noticed that setting the snippet fragment size to 0 (which in the normal highlighter returns the whole field highlighted) gives an error. Hmm, I should check it. Can you open a JIRA issue? Thank you, Koji -- http://www.rondhuit.com/en/ Marc Sturlese wrote: I am having some trouble making it work. I am debugging the code, and I see that when the FastVectorHighlighter instance is constructed, the parameters it receives are OK: // get FastVectorHighlighter instance out of the processing loop FastVectorHighlighter fvh = new FastVectorHighlighter( // FVH cannot process hl.usePhraseHighlighter parameter per-field basis params.getBool( HighlightParams.USE_PHRASE_HIGHLIGHTER, true ), // FVH cannot process hl.requireFieldMatch parameter per-field basis params.getBool( HighlightParams.FIELD_MATCH, false ), getFragListBuilder( params ), getFragmentsBuilder( params ) ); The query here is OK as well: FieldQuery fieldQuery = fvh.getFieldQuery( query ); But I can't see what's in fieldQuery (the debugger shows just a memory address, and I don't know how to do something similar to toString()). The problem I see is in: String[] snippets = highlighter.getBestFragments( fieldQuery, req.getSearcher().getReader(), docId, fieldName, params.getFieldInt( fieldName, HighlightParams.FRAGSIZE, 100 ), params.getFieldInt( fieldName, HighlightParams.SNIPPETS, 1 ) ); snippets ends up as an empty array, so it jumps to: alternateField( docSummaries, params, doc, fieldName ); In solrconfig.xml I added: <fragListBuilder name="simple" class="org.apache.solr.highlight.SimpleFragListBuilder" default="false"/> <fragmentsBuilder name="colored" class="org.apache.solr.highlight.MultiColoredScoreOrderFragmentsBuilder" default="false"/> Maybe I am missing something... any idea? Highlighting with doHighlightingByHighlighter works perfectly.
**I also have noticed that setting the snippet fragment size to 0 (which in the normal highlighter returns the whole field highlighted) gives an error. Koji Sekiguchi-2 wrote: Marc Sturlese wrote: How do I activate FastVectorHighlighter in trunk? Which of those params sets it up? <!-- Configure the standard fragListBuilder --> <fragListBuilder name="simple" class="org.apache.solr.highlight.SimpleFragListBuilder" default="true"/> <!-- Configure the standard fragmentsBuilder --> <fragmentsBuilder name="colored" class="org.apache.solr.highlight.MultiColoredScoreOrderFragmentsBuilder" default="true"/> <fragmentsBuilder name="scoreOrder" class="org.apache.solr.highlight.ScoreOrderFragmentsBuilder" default="true"/> Thanks in advance. You do not need to activate it. DefaultSolrHighlighter, which is the default SolrHighlighter impl, automatically uses FVH when the fields you specify through the hl.fl parameter have termVectors, termPositions and termOffsets all set to true. If you want to use the multi-colored tag feature, you need to specify MultiColored*FragmentsBuilder in solrconfig.xml. Koji -- http://www.rondhuit.com/en/
[jira] Resolved: (SOLR-1737) Add a FieldStreamDataSource
[ https://issues.apache.org/jira/browse/SOLR-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-1737. -- Resolution: Fixed committed r:903966 Add a FieldStreamDataSource --- Key: SOLR-1737 URL: https://issues.apache.org/jira/browse/SOLR-1737 Project: Solr Issue Type: New Feature Components: contrib - DataImportHandler Reporter: Noble Paul Assignee: Noble Paul Priority: Minor Fix For: 1.5 Attachments: SOLR-1737.patch TikaEntityProcessor needs a DataSource which returns a Stream instead of a Reader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.