Hi Alexandre,

The DIH is executed correctly and the tokenized representation is obtained correctly, but the URP chain is not executed with the call:
http://localhost:8983/solr/reed_jobs/update/details?commit=true

Isn't it the correct URL? Is there any parameter missing?

Best,
Roxana

On 22 October 2015 at 16:17, Alexandre Rafalovitch <arafa...@gmail.com> wrote:

> Well, I guess I imagined three steps:
> 1) Run DIH
> 2) Get the tokenized representation for each document using facets or
>    other approaches
> 3) Submit document partial-update request with additional custom
>    processing through URP
>
> Your example seems to be skipping step 2, so the URP chain does not
> know which documents to actually work on and is basically an empty
> call.
>
> Again, I suspect knowing the business objectives may bring other
> solutions to the front.
>
> Regards,
>    Alex.
> ----
> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> http://www.solr-start.com/
>
>
> On 22 October 2015 at 10:49, Roxana Danger
> <roxana.dan...@reedonline.co.uk> wrote:
> > Hi Alex,
> >
> > My idea behind this is to avoid two calls: first the importer, and then
> > the updater. As there is an update processor chain that can be used
> > after the DIH, I thought it was possible to get a real-time updater.
> >
> > So, I am taking your advice and dividing the process into different
> > steps. I have the following configuration:
> >
> > <updateRequestProcessorChain name="retrieveDetails">
> >   <processor class="MyUpdater1"/>
> >   <processor class="MyUpdater2"/>
> >   <processor class="solr.LogUpdateProcessorFactory" />
> >   <processor class="solr.RunUpdateProcessorFactory" />
> > </updateRequestProcessorChain>
> >
> > <requestHandler name="/update/details" class="solr.UpdateRequestHandler">
> >   <lst name="defaults">
> >     <str name="update.chain">retrieveDetails</str>
> >   </lst>
> > </requestHandler>
> >
> > <requestHandler name="/dataimport"
> >     class="org.apache.solr.handler.dataimport.DataImportHandler">
> >   <lst name="defaults">
> >     <str name="config">db-data-config.xml</str>
> >     <!-- <str name="update.chain">retrieveDetails</str> -->
> >   </lst>
> > </requestHandler>
> >
> > So, after the import (notice that it does not contain the update.chain),
> > I tried to run the update with the following request:
> > http://localhost:8983/solr/reed_jobs/update/details?commit=true
> > It returns immediately with status 0 but does not execute the update...
> > How should the update be called to reindex/update all the imported docs
> > with my chain?
> >
> >
> > Best regards,
> > Roxana
> >
> >
> > On 22 October 2015 at 14:14, Alexandre Rafalovitch <arafa...@gmail.com>
> > wrote:
> >
> >> You are doing things out of order. It's DIH, URP, then indexer. Any
> >> attempt to subvert that order for the record being indexed will end in
> >> problems.
> >>
> >> Have you considered doing a dual path? Index, then update. Of course,
> >> your fields all need to be stored for that.
> >>
> >> Also, perhaps you need to rethink the problem on a higher level. If
> >> all you need to do is to extract tokenized content of a field during
> >> search, you can do that in several ways, such as faceting on that
> >> field, or - I believe - using terms end-point.
> >>
> >> Regards,
> >>    Alex.
> >> ----
> >> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
> >> http://www.solr-start.com/
> >>
> >>
> >> On 22 October 2015 at 06:20, Roxana Danger
> >> <roxana.dan...@reedonline.co.uk> wrote:
> >> > Hello,
> >> >
> >> > I would like to create an updateRequestProcessorChain that should be
> >> > executed after a DB DIH. I am extending the
> >> > UpdateRequestProcessorFactory and UpdateRequestProcessor classes. The
> >> > processAdd method of my UpdateRequestProcessor should be able to
> >> > update the documents with the indexed terms associated with a field.
> >> > Notice that these terms should have been extracted with an analyzer
> >> > before my updateRequestProcessorChain processor begins to execute.
> >> >
> >> > The problem I am getting is that at the point where processAdd is
> >> > executed, the field containing the terms has not been filled. To
> >> > retrieve the terms I am using the SolrIndexSearcher provided during
> >> > the request (req.getSearcher()). However, it seems that this searcher
> >> > uses only the data physically stored and does not consider any of the
> >> > imported data.
> >> >
> >> > Any idea how I can access a searcher with all indexed/cached data
> >> > when the processAdd method is executed?
> >> >
> >> > Thank you very much in advance.
> >> >

--
Roxana Danger | Data Scientist
Dragon Court, 27-29 Macklin Street, London, WC2B 5LX
Tel: 020 7067 4568
reed.co.uk <http://www.reed.co.uk/>
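
For reference, a request to /update/details that carries only commit=true contains no documents, so the chain has nothing to call processAdd on; that matches the "empty call" remark in the thread. Below is a minimal sketch of step 3 (the partial-update request), written in Python with only the standard library. It assumes the unique key field is called `id`, that the terms from step 2 are already available as a dictionary keyed by document id, and that they are written to a field named `terms_ss`. None of these names come from the thread; they are placeholders.

```python
import json
import urllib.request

# Core URL taken from the request quoted in the thread.
SOLR_CORE_URL = "http://localhost:8983/solr/reed_jobs"
# Hypothetical target field; replace with the real field from your schema.
TERMS_FIELD = "terms_ss"


def build_atomic_updates(terms_by_id, field=TERMS_FIELD):
    """Turn {doc_id: [term, ...]} into a list of Solr atomic-update documents.

    Assumes the schema's unique key field is named "id".
    """
    return [
        {"id": doc_id, field: {"set": list(terms)}}
        for doc_id, terms in sorted(terms_by_id.items())
    ]


def post_updates(docs, handler="/update/details"):
    """POST the documents as JSON so the handler's chain receives add commands."""
    request = urllib.request.Request(
        SOLR_CORE_URL + handler + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Illustrative data only; in practice this comes from step 2.
    sample = {"job-1": ["java", "developer"], "job-2": ["data", "scientist"]}
    # Print the payload instead of sending it, so the sketch runs without Solr.
    print(json.dumps(build_atomic_updates(sample), indent=2))
```

Calling `post_updates(build_atomic_updates(terms))` against a running core would send one add command per document through the retrieveDetails chain. As noted in the thread, this kind of update relies on the other fields being stored. Processors placed ahead of the distributed update step will likely see only the partial document (the `id` plus the `set` operation), not the merged one, so custom processors may need to account for that.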