Re: Mismatch between replication API & index.properties
Yes, that is my understanding too, but the Replication handler response says it is referring to the index folder, not to the one shown in index.properties. Due to that confusion I am not able to delete the folder. Is this a bug, or default behavior where, irrespective of index.properties, it will always show the index folder only? Solr version - 6.6.2

On Wed, Jul 31, 2019, 21:17 jai dutt wrote:
> It's correct behaviour; Solr puts replica index files in this format only,
> and you can find the latest index pointed to in the index.properties file. Usually
> after a successful full replication Solr removes the old timestamp dir.
>
> On Wed, 31 Jul, 2019, 8:02 PM Aman Tandon, wrote:
> >
> > Hi,
> >
> > We are having a situation where the whole disk space is full, and on the server
> > we are seeing multiple index directories ending with a timestamp. Upon checking
> > the index.properties file for a particular shard replica, it is not referring
> > to the folder named *index*, but when I use the replication API I see it
> > pointing to the *index* folder. Am I missing something? Kindly advise.
> >
> > *directory*
> >
> > drwxrwxr-x. 2 fusion fusion 69632 Jul 30 23:24 index
> > drwxrwxr-x. 2 fusion fusion 28672 Jul 31 03:02 index.20190731005047763
> > drwxrwxr-x. 2 fusion fusion  4096 Jul 31 10:20 index.20190731095757917
> > -rw-rw-r--. 1 fusion fusion    78 Jul 31 03:02 index.properties
> > -rw-rw-r--. 1 fusion fusion   296 Jul 31 09:56 replication.properties
> > drwxrwxr-x. 2 fusion fusion  4096 Jan 16  2019 snapshot_metadata
> > drwxrwxr-x. 2 fusion fusion  4096 Jul 30 23:24 tlog
> >
> > *index.properties*
> >
> > #index.properties
> > #Wed Jul 31 03:02:12 EDT 2019
> > index=index.20190731005047763
> >
> > *REPLICATION API STATUS*
> >
> > 280.56 GB
> > /opt/solr/x_shard4_replica3/data/index/
> > ...
> > true
> > false
> > 1564543395563
> > 98884
> > ...
> > ...
> >
> > Regards,
> > Aman
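For anyone hitting the same disk-space situation, the two things to compare are the replica's index.properties file and the replication handler's details output; a sketch of the check (host, port, and core name are placeholders taken from the paths in this thread):

```
# Which directory the replica should currently be using:
cat /opt/solr/x_shard4_replica3/data/index.properties

# What the replication handler reports (inspect the indexPath value):
http://localhost:8983/solr/x_shard4_replica3/replication?command=details
```

After a successful full replication Solr normally removes the old index.&lt;timestamp&gt; directories itself; a directory that lingers and is not the one named in index.properties is the leftover candidate for cleanup.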
Mismatch between replication API & index.properties
Hi, We are having a situation where the whole disk space is full, and on the server we are seeing multiple index directories ending with a timestamp. Upon checking the index.properties file for a particular shard replica, it is not referring to the folder named *index*, but when I use the replication API I see it pointing to the *index* folder. Am I missing something? Kindly advise.

*directory*

drwxrwxr-x. 2 fusion fusion 69632 Jul 30 23:24 index
drwxrwxr-x. 2 fusion fusion 28672 Jul 31 03:02 index.20190731005047763
drwxrwxr-x. 2 fusion fusion  4096 Jul 31 10:20 index.20190731095757917
-rw-rw-r--. 1 fusion fusion    78 Jul 31 03:02 index.properties
-rw-rw-r--. 1 fusion fusion   296 Jul 31 09:56 replication.properties
drwxrwxr-x. 2 fusion fusion  4096 Jan 16  2019 snapshot_metadata
drwxrwxr-x. 2 fusion fusion  4096 Jul 30 23:24 tlog

*index.properties*

#index.properties
#Wed Jul 31 03:02:12 EDT 2019
index=index.20190731005047763

*REPLICATION API STATUS*

280.56 GB
/opt/solr/x_shard4_replica3/data/index/
...
true
false
1564543395563
98884
...
...

Regards,
Aman
Re: Solr relevancy score different on replicated nodes
Thanks Erick for your suggestions and time. On Tue, Feb 12, 2019, 22:32 Erick Erickson You really only have four > 1> use exactstats. This won't guarantee precise matches, but they'll be > closer > 2> optimize (not particularly recommended, but if you're willing to do > it periodically it'll have the stats match until the next updates). > 3> use TLOG/PULL replicas and confine the requests to the PULL > replicas. There'll _still_ be some window for mismatches, > specifically the default is commit_interval/2 > 4> define the problem away. > > Best, > Erick > > On Tue, Feb 12, 2019 at 2:42 AM Aman Tandon > wrote: > > > > Hi Erick, > > > > Any suggestions on this? > > > > Regards, > > Aman > > > > On Fri, Feb 8, 2019, 17:07 Aman Tandon > > > > Hi Erick, > > > > > > I find this thread very relevant to the people who are facing the same > > > problem. > > > > > > In our case, we have a signals aggregation collection which is having > > > total of around 8 million records. We have Solr cloud architecture(3 > shards > > > and 4 replicas) and the whole size of index is of around 2.5 GB. > > > > > > We use this collection to fetch the most clicked products against a > query > > > and boost in search results. Boost score is the query score on > aggregation > > > collection. > > > > > > But when the query goes to different replica we get different boost > score > > > for some of the keywords, hence on page refresh results ordering keep > on > > > changing. > > > > > > In order to solve we tried the exactstats cache for distributed IDF > and on > > > debug level I am seeing global stats merge in logs but still the > different > > > scores coming on refreshing the results from aggregation collection. > > > > > > Our indexing occur once a day so should we do daily optimization or > should > > > we reduce merge segment count to 2/3 currently it is -1. > > > > > > What are your suggestions on this? 
> > > > > > Regards, > > > Aman > > > > > > On Fri, Feb 8, 2019, 00:15 Erick Erickson wrote: > > > > > >> Optimization is safe. The large segment is irrelevant, you'll > > >> lose a little parallelization, but on an index with this few > > >> documents I doubt you'll notice. > > >> > > >> As of Solr 5, optimize will respect the max segment size > > >> which defaults to 5G, but you're well under that limit. > > >> > > >> Best, > > >> Erick > > >> > > >> On Sun, Feb 3, 2019 at 11:54 PM Ashish Bisht > > > >> wrote: > > >> > > > >> > Thanks Erick and everyone.We are checking on stats cache. > > >> > > > >> > I noticed stats skew again and optimized the index to correct the > > >> same.As > > >> > per the documents. > > >> > > > >> > > > >> > https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ > > >> > and > > >> > > > >> > https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/ > > >> > > > >> > wanted to check on below points considering we want stats skew to be > > >> > corrected. > > >> > > > >> > 1.When optimized single segment won't be natural merged easily.As we > > >> might > > >> > be doing manual optimize every time,what I visualize is at a certain > > >> point > > >> > in future we might be having a single large segment.What impact this > > >> large > > >> > segment is going to have? > > >> > Our index ~30k documents i.e files with content(Segment size <1Gb > as of > > >> now) > > >> > > > >> > 1.Do you recommend going for optimize in these situations?Probably > it > > >> will > > >> > be done only when stats skew.Is it safe? > > >> > > > >> > Regards > > >> > Ashish > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > -- > > >> > Sent from: > http://lucene.472066.n3.nabble.com/Solr-User-f472068.html > > >> > > > >
Re: Solr relevancy score different on replicated nodes
Hi Erick, Any suggestions on this? Regards, Aman On Fri, Feb 8, 2019, 17:07 Aman Tandon Hi Erick, > > I find this thread very relevant to the people who are facing the same > problem. > > In our case, we have a signals aggregation collection which is having > total of around 8 million records. We have Solr cloud architecture(3 shards > and 4 replicas) and the whole size of index is of around 2.5 GB. > > We use this collection to fetch the most clicked products against a query > and boost in search results. Boost score is the query score on aggregation > collection. > > But when the query goes to different replica we get different boost score > for some of the keywords, hence on page refresh results ordering keep on > changing. > > In order to solve we tried the exactstats cache for distributed IDF and on > debug level I am seeing global stats merge in logs but still the different > scores coming on refreshing the results from aggregation collection. > > Our indexing occur once a day so should we do daily optimization or should > we reduce merge segment count to 2/3 currently it is -1. > > What are your suggestions on this? > > Regards, > Aman > > On Fri, Feb 8, 2019, 00:15 Erick Erickson >> Optimization is safe. The large segment is irrelevant, you'll >> lose a little parallelization, but on an index with this few >> documents I doubt you'll notice. >> >> As of Solr 5, optimize will respect the max segment size >> which defaults to 5G, but you're well under that limit. >> >> Best, >> Erick >> >> On Sun, Feb 3, 2019 at 11:54 PM Ashish Bisht >> wrote: >> > >> > Thanks Erick and everyone.We are checking on stats cache. >> > >> > I noticed stats skew again and optimized the index to correct the >> same.As >> > per the documents. 
>> > >> > >> https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ >> > and >> > >> https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/ >> > >> > wanted to check on below points considering we want stats skew to be >> > corrected. >> > >> > 1.When optimized single segment won't be natural merged easily.As we >> might >> > be doing manual optimize every time,what I visualize is at a certain >> point >> > in future we might be having a single large segment.What impact this >> large >> > segment is going to have? >> > Our index ~30k documents i.e files with content(Segment size <1Gb as of >> now) >> > >> > 1.Do you recommend going for optimize in these situations?Probably it >> will >> > be done only when stats skew.Is it safe? >> > >> > Regards >> > Ashish >> > >> > >> > >> > >> > >> > >> > -- >> > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html >> >
Re: Solr relevancy score different on replicated nodes
Hi Erick, I find this thread very relevant to the people who are facing the same problem. In our case, we have a signals aggregation collection which has a total of around 8 million records. We have a SolrCloud architecture (3 shards and 4 replicas), and the whole index is around 2.5 GB in size. We use this collection to fetch the most-clicked products for a query and boost them in search results; the boost score is the query score on the aggregation collection. But when the query goes to a different replica we get a different boost score for some of the keywords, hence on page refresh the results ordering keeps changing. To solve this we tried the exactStatsCache for distributed IDF, and at debug level I can see the global stats merge in the logs, but still different scores come back on refreshing the results from the aggregation collection. Our indexing occurs once a day, so should we do a daily optimization, or should we reduce the merge segment count to 2/3 (currently it is -1)? What are your suggestions on this? Regards, Aman On Fri, Feb 8, 2019, 00:15 Erick Erickson wrote: > Optimization is safe. The large segment is irrelevant, you'll > lose a little parallelization, but on an index with this few > documents I doubt you'll notice. > > As of Solr 5, optimize will respect the max segment size > which defaults to 5G, but you're well under that limit. > > Best, > Erick > > On Sun, Feb 3, 2019 at 11:54 PM Ashish Bisht > wrote: > > > > Thanks Erick and everyone. We are checking on stats cache. > > > > I noticed stats skew again and optimized the index to correct the same. As > > per the documents > > > > > https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ > > and > > > https://lucidworks.com/2018/06/20/solr-and-optimizing-your-index-take-ii/ > > > > I wanted to check on the below points, considering we want the stats skew to be > > corrected.
> > > > 1.When optimized single segment won't be natural merged easily.As we > might > > be doing manual optimize every time,what I visualize is at a certain > point > > in future we might be having a single large segment.What impact this > large > > segment is going to have? > > Our index ~30k documents i.e files with content(Segment size <1Gb as of > now) > > > > 1.Do you recommend going for optimize in these situations?Probably it > will > > be done only when stats skew.Is it safe? > > > > Regards > > Ashish > > > > > > > > > > > > > > -- > > Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html >
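For readers of the archive: the exactStatsCache discussed in this thread is a one-line change in solrconfig.xml; a minimal sketch:

```xml
<!-- solrconfig.xml: use exact global collection statistics (distributed IDF)
     instead of per-shard/per-replica stats, so scores agree across nodes. -->
<statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>
```

Note that, as Erick's reply points out, replicas of the same shard can still briefly disagree because their segment sets (and deleted-document counts) differ until merges and commits line up, which is why an optimize temporarily corrects the skew.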
Re: Get MLT Interesting Terms for a set of documents corresponding to the query specified
I see two rows params; it looks like one will be overwritten by rows=2, and then your tags filter is matching only one document. Please remove the extra rows param and try again. On Mon, Jan 21, 2019, 08:44 Pratik Patel wrote: > Hi Everyone! > > I am trying to use the MLT request handler. My query matches more than one > document, but the response always seems to pick up the first document, and > interestingTerms also seems to correspond to that single document only. > > What I am expecting is that if my query matches multiple documents then the > interestingTerms result also corresponds to that set of documents > and not just the first document. > > Following is my query, > > > http://localhost:8081/solr/collection1/mlt?debugQuery=on=tags:test=true=mlt.fl=textpropertymlt=details=1=2=3=*:*=100=2=0 > > Ultimately, my goal is to get interesting terms corresponding to this whole > set of documents. I don't need similar documents as such. If not with mlt, > is there any other way I can achieve this? That is, given a query matching a > set of documents, find interestingTerms for that set of documents based on > tf-idf? > > Thanks! > Pratik >
Re: solr-query
Hi Shilpa, I am assuming you know what synonyms do. In Solr, synonyms are applied to the tokens being indexed/queried for a field. To apply synonyms to a field you need to update the schema.xml configuration, where you also point to a file (synonyms.txt by default; you can create a separate file per field as well) that holds the synonyms for your business requirement. You can configure synonyms to apply at index time, query time, or both for a field. However, if you apply them at index time, then any new addition to the synonym file requires reindexing the whole collection. You can read more about synonyms here: https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-SynonymGraphFilter Also a good blog on multi-word synonyms: https://lucidworks.com/2017/04/18/multi-word-synonyms-solr-adds-query-time-support/ On Fri, Jan 18, 2019, 21:44 Shilpa Solanki wrote: > Hello, > > Can you tell me how to use synonyms with Apache Solr? > > > Thanks & Regards, > Shilpa solanki >
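A minimal sketch of what the schema.xml change described above can look like (the field type name and file name here are examples, not from the original mail):

```xml
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- query-time synonyms: editing synonyms.txt needs only a core reload,
         not a full reindex -->
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```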
Re: Zookeeper timeout issue -
As Jan mentioned, also check GC activity and memory issues, and check the threads to see whether any are pending/waiting for too long. On Fri, Dec 28, 2018, 16:14 AshB wrote: > Hi Dominique, > > Yes, we are load testing with 50 users. We tried changing the timeout but it's > not reflecting. > > Regards > Ashish >
Re: REBALANCELEADERS is not reliable
++ correction On Fri, Nov 30, 2018, 01:10 Aman Tandon For me today, I deleted the leader replica of one of the two shard > collection. Then other replicas of that shard wasn't getting elected for > leader. > > After waiting for long tried the setting addreplicaprop preferred leader > on one of the replica then tried FORCELEADER but no luck. Then also tried > rebalance but no help. Finally have to recreate the whole collection. > > Not sure what was the issue but both FORCELEADER AND REBALANCING didn't > work if there was no leader however preferred leader property was setted. > > On Wed, Nov 28, 2018, 12:54 Bernd Fehling wrote: > >> Hi Vadim, >> >> thanks for confirming. >> So it seems to be a general problem with Solr 6.x, 7.x and might >> be still there in the most recent versions. >> >> But where to start to debug this problem, is it something not >> correctly stored in zookeeper or is overseer the problem? >> >> I was also reading something about a "leader queue" where possible >> leaders have to be requeued or something similar. >> >> May be I should try to get a situation where a "locked" core >> is on the overseer and then connect the debugger to it and step >> through it. >> Peeking and poking around, like old Commodore 64 days :-) >> >> Regards, Bernd >> >> >> Am 27.11.18 um 15:47 schrieb Vadim Ivanov: >> > Hi, Bernd >> > I have tried REBALANCELEADERS with Solr 6.3 and 7.5 >> > I had very similar results and notion that it's not reliable :( >> > -- >> > Br, Vadim >> > >> >> -Original Message- >> >> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de] >> >> Sent: Tuesday, November 27, 2018 5:13 PM >> >> To: solr-user@lucene.apache.org >> >> Subject: REBALANCELEADERS is not reliable >> >> >> >> Hi list, >> >> >> >> unfortunately REBALANCELEADERS is not reliable and the leader >> >> election has unpredictable results with SolrCloud 6.6.5 and >> >> Zookeeper 3.4.10. >> >> Seen with 5 shards / 3 replicas. 
>> >> >> >> - CLUSTERSTATUS reports all replicas (core_nodes) as state=active. >> >> - setting with ADDREPLICAPROP the property preferredLeader to other >> replicas >> >> - calling REBALANCELEADERS >> >> - some leaders have changed, some not. >> >> >> >> I then tried: >> >> - removing all preferredLeader properties from replicas which >> succeeded. >> >> - trying again REBALANCELEADERS for the rest. No success. >> >> - Shutting down nodes to force the leader to a specific replica left >> running. >> >>No success. >> >> - calling REBALANCELEADERS responds that the replica is inactive!!! >> >> - calling CLUSTERSTATUS reports that the replica is active!!! >> >> >> >> Also, the replica which don't want to become leader is not in the list >> >> of collections->[collection_name]->leader_elect->shard1..x->election >> >> >> >> Where is CLUSTERSTATUS getting it's state info from? >> >> >> >> Has anyone else problems with REBALANCELEADERS? >> >> >> >> I noticed that the Reference Guide writes "preferredLeader" (with >> capital "L") >> >> but the JAVA code has "preferredleader". >> >> >> >> Regards, Bernd >> > >> >
Re: REBALANCELEADERS is not reliable
For me today, I deleted the leader replica of one of the two shard collection. Then other replica of that shard was getting elected for leader. After waiting for long tried the setting addreplicaprop preferred leader on one of the replica then tried FORCELEADER but no luck. Then also tried rebalance but no help. Finally have to recreate the whole collection. Not sure what was the issue but both FORCELEADER AND REBALANCING didn't work if there was no leader however preferred leader property was setted. On Wed, Nov 28, 2018, 12:54 Bernd Fehling Hi Vadim, > > thanks for confirming. > So it seems to be a general problem with Solr 6.x, 7.x and might > be still there in the most recent versions. > > But where to start to debug this problem, is it something not > correctly stored in zookeeper or is overseer the problem? > > I was also reading something about a "leader queue" where possible > leaders have to be requeued or something similar. > > May be I should try to get a situation where a "locked" core > is on the overseer and then connect the debugger to it and step > through it. > Peeking and poking around, like old Commodore 64 days :-) > > Regards, Bernd > > > Am 27.11.18 um 15:47 schrieb Vadim Ivanov: > > Hi, Bernd > > I have tried REBALANCELEADERS with Solr 6.3 and 7.5 > > I had very similar results and notion that it's not reliable :( > > -- > > Br, Vadim > > > >> -Original Message- > >> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de] > >> Sent: Tuesday, November 27, 2018 5:13 PM > >> To: solr-user@lucene.apache.org > >> Subject: REBALANCELEADERS is not reliable > >> > >> Hi list, > >> > >> unfortunately REBALANCELEADERS is not reliable and the leader > >> election has unpredictable results with SolrCloud 6.6.5 and > >> Zookeeper 3.4.10. > >> Seen with 5 shards / 3 replicas. > >> > >> - CLUSTERSTATUS reports all replicas (core_nodes) as state=active. 
> >> - setting with ADDREPLICAPROP the property preferredLeader to other > replicas > >> - calling REBALANCELEADERS > >> - some leaders have changed, some not. > >> > >> I then tried: > >> - removing all preferredLeader properties from replicas which succeeded. > >> - trying again REBALANCELEADERS for the rest. No success. > >> - Shutting down nodes to force the leader to a specific replica left > running. > >>No success. > >> - calling REBALANCELEADERS responds that the replica is inactive!!! > >> - calling CLUSTERSTATUS reports that the replica is active!!! > >> > >> Also, the replica which don't want to become leader is not in the list > >> of collections->[collection_name]->leader_elect->shard1..x->election > >> > >> Where is CLUSTERSTATUS getting it's state info from? > >> > >> Has anyone else problems with REBALANCELEADERS? > >> > >> I noticed that the Reference Guide writes "preferredLeader" (with > capital "L") > >> but the JAVA code has "preferredleader". > >> > >> Regards, Bernd > > >
Re: SOLR Partial search
Hi Piyush, I suppose your end goal is to search special chars too, and I assume you are using this for typeahead. The keyword tokenizer keeps the complete string as a single token, so when you search with a partial string it won't match. You could add an n-gram filter; the output of the keyword tokenizer will then be broken into the configured grams, and that might help you. Please give it a try and let us know. Regards, Aman On Mon, Oct 8, 2018, 10:24 Rathor, Piyush (US - Philadelphia) < prat...@deloitte.com> wrote: > Hi All, > > I am trying to use “KeywordTokenizerFactory” to support searching against > the special characters in the search. > > But the partial search does not work well with “KeywordTokenizerFactory”. > > The partial match results are better with “StandardTokenizerFactory”. > > Field type – text_general > > Example for both scenarios: > > Partial search parameter: Nah' > Expected result on top: Nah’bir > > Partial search: shar > Full name: Sharma > > Please let me know if there is something that can be done to cater to both > special characters and partial matches together. > > Thanks & Regards > > Piyush R > > This message (including any attachments) contains confidential information > intended for a specific individual and purpose, and is protected by law. If > you are not the intended recipient, you should delete this message; any > disclosure, copying, or distribution of this message, or the taking of any > action based on it, is strictly prohibited. > > v.E.1 >
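A sketch of the suggested field type (names and gram sizes are illustrative): keep the whole string as one token, lowercase it, and emit edge n-grams at index time so partial prefixes like "shar" match "Sharma"; special characters survive because no tokenizer splits on them:

```xml
<fieldType name="text_partial" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- EdgeNGramFilterFactory matches prefixes; swap in NGramFilterFactory
         if matches anywhere inside the string are needed -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```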
Re: deleted master index files replica did not replicate
Hi Jeff, There should be a slave configuration section in the solrconfig file which tells the slave to ping the master, check the version, and fetch the modified files. If replication is configured on the slave you will see commands getting triggered, and you could get some idea from there. You could also paste that log here if it is not clear. Regards, Aman On Mon, Jun 4, 2018, 23:57 Jeff Courtade wrote: > To be clear, I deleted the actual index files out from under the running > master. > > On Mon, Jun 4, 2018, 2:25 PM Jeff Courtade wrote: > > > So are you saying it should have? > > > > It really acted like a normal function; this happened on 5 different pairs > > in the same way. > > > > > > On Mon, Jun 4, 2018, 2:23 PM Aman Tandon > wrote: > > > >> Could you please check the replication request commands in the solr logs of > >> the slave and see if it is complaining about anything. > >> > >> On Mon, Jun 4, 2018, 23:45 Jeff Courtade > wrote: > >> > >> > Hi, > >> > > >> > This I think is a very simple question. > >> > > >> > I have a solr 4.3 master slave setup. > >> > > >> > Simple replication. > >> > > >> > The master and slave were both running and synchronized up to date. > >> > > >> > I went on the master and deleted the index files while solr was > running. > >> > Solr created new empty index files and continued to serve requests. > >> > The slave did not delete its indexes, kept all of the old data in > place, > >> > and continued to serve requests. > >> > > >> > This was strange, as I would have thought the replica would have > replicated > >> > an empty index from the master. > >> > > >> > Does anyone have an explanation for this? I am fairly certain I just am > not > >> > understanding something basic. > >> > > >> > J > >> > > >> > -- > >> > > >> > Jeff Courtade > >> > M: 240.507.6116 > >> > > >> > > -- > > > > Jeff Courtade > > M: 240.507.6116 > > > -- > > Jeff Courtade > M: 240.507.6116 >
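For reference, the slave-side configuration being referred to is the ReplicationHandler section of solrconfig.xml; a minimal sketch (the masterUrl, core name, and poll interval are placeholders):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- the slave polls this URL for the master's index version -->
    <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

With this in place the slave's log shows the periodic poll and any fetchindex activity against the master, which is where to look when replication does not behave as expected.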
Re: UUIDUpdateProcessorFactory can cause duplicate documents?
Hi, Suppose id is the field linked to the UUID processor in the configuration. If it is missing in a document coming in for indexing, the processor will generate a UUID and set it in the id field; however, if the id field is present with some value, it won't. So yes, indexing the same document twice without the id field will produce two documents, each with its own UUID. Kindly refer to http://lucene.apache.org/solr/5_5_0/solr-core/org/apache/solr/update/processor/UUIDUpdateProcessorFactory.html On Mon, Jun 4, 2018, 23:52 S G wrote: > Hi, > > Is it correct to assume that UUIDUpdateProcessorFactory will produce 2 > documents even if the same document is indexed twice without the "id" field > ? > > And to avoid such a thing, we can use the technique mentioned in > https://wiki.apache.org/solr/Deduplication ? > > Thanks > SG >
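A sketch of the configuration the javadoc describes, wiring the processor into an update chain (the chain name is an example):

```xml
<updateRequestProcessorChain name="uuid">
  <!-- fills in "id" with a fresh UUID only when the incoming document
       does not already carry a value for it -->
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Because each document without an id gets a fresh UUID, re-sending the same content does create a second document, which is where the deduplication technique (SignatureUpdateProcessorFactory) from the wiki page linked in the question comes in.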
Re: deleted master index files replica did not replicate
Could you please check the replication request commands in the Solr logs of the slave and see if it is complaining about anything. On Mon, Jun 4, 2018, 23:45 Jeff Courtade wrote: > Hi, > > This I think is a very simple question. > > I have a solr 4.3 master slave setup. > > Simple replication. > > The master and slave were both running and synchronized up to date. > > I went on the master and deleted the index files while solr was running. > Solr created new empty index files and continued to serve requests. > The slave did not delete its indexes, kept all of the old data in place, > and continued to serve requests. > > This was strange, as I would have thought the replica would have replicated > an empty index from the master. > > Does anyone have an explanation for this? I am fairly certain I just am not > understanding something basic. > > J > > -- > > Jeff Courtade > M: 240.507.6116 >
Re: Solr score use cases
Hi Faraz, The Solr score, which you can retrieve by adding score to the fl parameter, can be helpful in the following ways: 1) Search relevance ranking: seeing how much score Solr has given to the top and second document, and with debug=true you can better understand what produced that score. 2) You can use a function query to multiply the score by some feature, e.g. a paid-customer score, popularity score, etc., to improve relevance as per the business. These are the points I can think of; someone else can shed more light if I am missing anything. I hope this is what you wanted to know. Regards, Aman On Dec 1, 2017 13:38, "Faraz Fallahi" wrote: Hi A simple question: what are the most common use cases for the Solr score of documents retrieved after firing queries? I don't have a real understanding of its purpose at the moment. Thx for helping
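A few request sketches for the two points above (the collection and field names, such as popularity_score, are examples, not from the original mail):

```
# 1) return the score with each hit, and explain how it was computed:
q=laptop&fl=id,name,score&debug=true

# 2) multiply the relevance score by a popularity feature (edismax):
q=laptop&defType=edismax&qf=name&boost=popularity_score
```

The edismax boost parameter is multiplicative, so documents with a higher popularity_score are lifted in proportion to their text relevance rather than by a flat additive amount.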
Range facet over currency field
Hi, I have a question about how to do a range facet in a different currency. I have indexed the price data in USD in the field price_usd_c, and I have a currency.xml which is generated by a process. If I want to do a range facet on the field price_usd_c in the Euro currency, how can I do it, and what is the syntax? Is there any way to do so? If so, kindly help. Regards, Aman
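In case it helps later readers: the Reference Guide documents range faceting on currency fields by giving the currency code on the range parameters; a sketch (values are illustrative — check the Currencies section of the Reference Guide for your Solr version):

```
q=*:*&facet=true&facet.range=price_usd_c
  &f.price_usd_c.facet.range.start=0.00,EUR
  &f.price_usd_c.facet.range.end=1000.00,EUR
  &f.price_usd_c.facet.range.gap=100.00,EUR
```

The conversion between the indexed USD values and the EUR buckets uses the exchange rates from currency.xml at query time.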
Re: How to build solr
Hi Srini, Kindly refer to the README of the GitHub repository at the link below; that should work. https://github.com/apache/lucene-solr/blob/master/README.md With regards, Aman Tandon On Sep 21, 2017 1:53 PM, "srini sampath" <sampathsrini.c...@gmail.com> wrote: > Hi, > How do I build and compile Solr on my local machine? It seems the > https://wiki.apache.org/solr/HowToCompileSolr page became obsolete. > Thanks in advance >
Re: Provide suggestion on indexing performance
Hi Shawn, Thanks for your reply, this is really helpful. I will try this out to see the performance with the docValues. With regards, Aman Tandon On Sep 15, 2017 9:10 PM, "Shawn Heisey" <apa...@elyograg.org> wrote: > On 9/11/2017 9:06 PM, Aman Tandon wrote: > > We want to know about the indexing performance in the below mentioned > > scenarios, consider the total number of 10 string fields and total number > > of documents are 10 million. > > > > 1) indexed=true, stored=true > > 2) indexed=true, docValues=true > > > > Which one should we prefer in terms of indexing performance, please share > > your experience. > > There are several settings in the schema for each field, things like > indexed, stored, docValues, multiValued, and others. You should base > your choices on what you need Solr to do. Choosing these settings based > purely on desired indexing speed may result in Solr not doing what you > want it to do. > > When the indexing system sends data to Solr with several threads or > processes, Solr is *usually* capable of indexing data faster than most > systems can supply it. The more settings you disable on a field, the > faster Solr will be able to index. > > It is not possible to provide precise numbers, because performance > depends on many factors, some of which you may not even know until you > build a production system. > > https://lucidworks.com/sizing-hardware-in-the-abstract-why- > we-dont-have-a-definitive-answer/ > > All that said ... docValues MIGHT be a little bit faster than stored, > because stored data is compressed, and the compression takes CPU time. > On a fully populated production system, that statement might turn out to > be wrong. There may be factors that result in stored fields working > better. The best way to decide is to try it both ways with all your data. > > Thanks, > Shawn > >
Re: Provide suggestion on indexing performance
Hi Tom, Thanks for your suggestion and the information. I will try this out to test and will share the results. On Sep 14, 2017 2:32 PM, "Sreenivas.T" <sree...@gmail.com> wrote: > I agree with Tom. Doc values and stored fields are present for different > reasons. Doc values is another index that gets build for faster > sorting/faceting. > > On Wed, Sep 13, 2017 at 11:30 PM Tom Evans <tevans...@googlemail.com> > wrote: > > > On Tue, Sep 12, 2017 at 4:06 AM, Aman Tandon <amantandon...@gmail.com> > > wrote: > > > Hi, > > > > > > We want to know about the indexing performance in the below mentioned > > > scenarios, consider the total number of 10 string fields and total > number > > > of documents are 10 million. > > > > > > 1) indexed=true, stored=true > > > 2) indexed=true, docValues=true > > > > > > Which one should we prefer in terms of indexing performance, please > share > > > your experience. > > > > > > With regards, > > > Aman Tandon > > > > Your question doesn't make much sense. You turn on stored when you > > need to retrieve the original contents of the fields after searching, > > and you use docvalues to speed up faceting, sorting and grouping. > > Using docvalues to retrieve values during search is more expensive > > than simply using stored values, so if your primary aim is retrieving > > stored values, use stored=true. > > > > Secondly, the only way to answer performance questions for your schema > > and data is to try it out. Generate 10 million docs, store them in a > > doc (eg as CSV), and then use the post tool to try different schema > > and query options. > > > > Cheers > > > > Tom > > >
Provide suggestion on indexing performance
Hi, We want to know about the indexing performance in the below-mentioned scenarios: consider a total of 10 string fields and a total of 10 million documents. 1) indexed=true, stored=true 2) indexed=true, docValues=true Which one should we prefer in terms of indexing performance? Please share your experience. With regards, Aman Tandon
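For concreteness, the two variants being compared are schema.xml field definitions like the following (the field name is an example):

```xml
<!-- 1) stored: original values retrieved from (compressed) stored fields -->
<field name="brand" type="string" indexed="true" stored="true"/>

<!-- 2) docValues: columnar storage, built for sorting/faceting/grouping -->
<field name="brand" type="string" indexed="true" stored="false" docValues="true"/>
```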
Re: Problems retrieving large documents
Did you find any error in the Solr logs? On Sat, Jul 29, 2017, 23:13 Aman Tandon <amantandon...@gmail.com> wrote: > Hello, > > Kindly check the Solr logs when you are hitting the query. Attach them > here, so that I can give more insight. > > To me it looks like an OOM, but check the Solr logs; I hope we can get > more information from there. > > On Sat, Jul 29, 2017, 14:35 SOLR6932 <lbarlet...@gmail.com> wrote: > >> Hey all, >> I am using Solr 4.10.3 and my collection consists of around 2300 large >> documents that are distributed across a number of shards. Each document is >> estimated to be around 50-70 megabytes. The queries that I run are >> sophisticated and involve a range of parameters and diverse query filters. >> Whenever I wish to retrieve all the returned document fields (fl:* [around >> 50 fields in my schema]), I receive an impossible exception - specifically >> /org.apache.solr.common.SolrException: Impossible Exception/ that is >> logged >> by both SolrCore and SolrDispatchFilter. Has anyone experienced a similar >> problem and knows how to solve this issue? >> Thanks in advance, >> Louie. >> >> -- >> View this message in context: >> http://lucene.472066.n3.nabble.com/Problems-retrieving-large-documents-tp4348169.html >> Sent from the Solr - User mailing list archive at Nabble.com. >> >
Re: edismax, pf2 and use of both AND and OR parameter
Hi, Ideally it should, but from the debug query it seems that it is not respecting the Boolean clauses. Could anyone else help here? Is this the intended behavior? On Jul 31, 2017 5:47 PM, "Niraj Aswani" <nirajasw...@gmail.com> wrote: > Hi Aman, > > Thank you very much your reply. > > Let me elaborate my question a bit more using your example in this case. > > AFAIK, what the pf2 parameter is doing to the query is adding the following > phrase queries: > > (_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail") > > There are three phrases being checked here: > - system memory > - memory oem > - oem retail > > However, what I actually expected it to look like is the following: > - system memory > - memory oem > - memory retail > > My understanding of the edismax parser is that it interprets the AND / OR > parameters correctly so it should generate the bi-gram phrases respecting > the AND /OR parameters as well, right? > > Am I missing something here? > > Regards, > Niraj > > On Mon, Jul 31, 2017 at 4:24 AM, Aman Tandon <amantandon...@gmail.com> > wrote: > > > Hi Niraj, > > > > Should I expect it to check the following bigram phrases? > > > > Yes it will check.
> > > > ex- documents & query is given below > > > > http://localhost:8983/solr/myfile/select?wt=xml=name; > > indent=on=*System > > AND Memory AND (OEM OR Retail)*=50=json&*qf=_text_=_text_* > > =true=edismax > > > > > > > > > > > > A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System > > Memory - OEM > > > > > > > > > > > > > > CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) > > System Memory - Retail > > > > > > > > > > > > > > CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) > > Dual Channel Kit System Memory - Retail > > > > > > > > > > > > > > *Below is the parsed query* > > > > > > +(+(_text_:system) +(_text_:memory) +((_text_:oem) (_text_:retail))) > > ((_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail")) > > > > > > In case if you are in such scenarios where you need to knwo what query > will > > form, then you could us the debug=true to know more about the query & > > timings of different component. > > > > *And when the ps2 is not specified default ps will be applied on pf2.* > > > > I hope this helps. > > > > With Regards > > Aman Tandon > > > > On Mon, Jul 31, 2017 at 4:18 AM, Niraj Aswani <nirajasw...@gmail.com> > > wrote: > > > > > Hi, > > > > > > I am using solr 4.4 and bit confused about how does the edismax parser > > > treat the pf2 parameter when both the AND and OR operators are used in > > the > > > query with ps2=0 > > > > > > For example: > > > > > > pf2=title^100 > > > q=HDMI AND Video AND (Wire OR Cable) > > > > > > Should I expect it to check the following bigram phrases? > > > > > > hdmi video > > > video wire > > > video cable > > > > > > Regards > > > Niraj > > > > > >
Re: edismax, pf2 and use of both AND and OR parameter
Hi Niraj, Should I expect it to check the following bigram phrases? Yes it will check. ex- documents & query is given below http://localhost:8983/solr/myfile/select?wt=xml=name=on=*System AND Memory AND (OEM OR Retail)*=50=json&*qf=_text_=_text_* =true=edismax A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail *Below is the parsed query* +(+(_text_:system) +(_text_:memory) +((_text_:oem) (_text_:retail))) ((_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail")) If you are in such a scenario where you need to know what query will be formed, you could use debug=true to learn more about the query and the timings of the different components. *And when ps2 is not specified, the default ps will be applied to pf2.* I hope this helps. With Regards Aman Tandon On Mon, Jul 31, 2017 at 4:18 AM, Niraj Aswani <nirajasw...@gmail.com> wrote: > Hi, > > I am using solr 4.4 and bit confused about how does the edismax parser > treat the pf2 parameter when both the AND and OR operators are used in the > query with ps2=0 > > For example: > > pf2=title^100 > q=HDMI AND Video AND (Wire OR Cable) > > Should I expect it to check the following bigram phrases? > > hdmi video > video wire > video cable > > Regards > Niraj >
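The parsed query above shows what pf2 actually does: it pairs up adjacent terms in query order and ignores the Boolean grouping. A minimal Python sketch of that observed behavior (an illustration only, not Solr's actual implementation):

```python
def pf2_phrases(terms):
    # pf2 builds boost phrases from adjacent term pairs in query order,
    # ignoring any Boolean grouping such as (OEM OR Retail)
    return [f'"{a} {b}"' for a, b in zip(terms, terms[1:])]

# Terms, in query order, for: System AND Memory AND (OEM OR Retail)
print(pf2_phrases(["system", "memory", "oem", "retail"]))
# -> ['"system memory"', '"memory oem"', '"oem retail"']
```

This matches the `(_text_:"system memory") (_text_:"memory oem") (_text_:"oem retail")` clauses seen in the debug output, including the "oem retail" phrase that Niraj did not expect.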
Re: Problems retrieving large documents
Hello, Kindly check the Solr logs when you are hitting the query. Attach them here so that I can give more insight. To me it looks like an OOM, but check the Solr logs; I hope we can get more information from there. On Sat, Jul 29, 2017, 14:35 SOLR6932 wrote: > Hey all, > I am using Solr 4.10.3 and my collection consists around 2300 large > documents that are distributed across a number of shards. Each document is > estimated to be around 50-70 megabytes. The queries that I run are > sophisticated, involve a range of parameters and diverse query filters. > Whenever I wish to retrieve all the returned document fields (fl:* [around > 50 fields in my schema]), I receive an impossible exception - specifically > /org.apache.solr.common.SolrException: Impossible Exception/ that is logged > by both SolrCore and SolrDispachFilter. Has anyone experienced a similar > problem and knows how to solve this issue? > Thanks in advance, > Louie. > > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Problems-retrieving-large-documents-tp4348169.html > Sent from the Solr - User mailing list archive at Nabble.com. >
Re: solr cloud vs standalone solr
Hello Sara, There is no hard and fast rule; performance depends on caches, RAM, HDD, etc., and on how many resources you can invest to keep performance acceptable. Information on the number of indexed documents and the number of dynamic fields can be found at the link below. I hope this helps. http://lucene.472066.n3.nabble.com/Solr-limitations-td4076250.html On Sat, Jul 29, 2017, 13:23 sara hajili wrote: > hi all, > I want to know when standalone solr can't be sufficient for storing data > and we need to migrate to solr cloud?for example standalone solr take too > much time to return query result or to store document or etc. > > in other word ,what is best capacity and data index size in standalone > solr that doesn't bad effect on query running and data inserting > performance?and after passing this index size i must switch to solr cloud? >
Problem to specify end parameter for range facets
Hi, I want to do range facets with a gap of 10, but I don't know the end value, as it could be very large. How can I do that? Thanks Aman Tandon
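One option (a hedged sketch; `price` is a hypothetical field name) is to pick a reasonable facet.range.end — the parameter is mandatory for range faceting — and let facet.range.other=after report a single count for everything beyond it:

```text
facet=true
&facet.range=price
&facet.range.start=0
&facet.range.end=1000
&facet.range.gap=10
&facet.range.other=after
```

With this, values above the end value are not lost; they show up in one "after" bucket instead of needing an unbounded number of ranges.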
Re: Multilevel sorting in JSON-facet
any help here? With Regards Aman Tandon On Thu, Nov 17, 2016 at 7:16 PM, Wonderful Little Things < amantandon...@gmail.com> wrote: > Hi, > > I want to do the sorting on multiple fields using the JSON-facet API, so > is this available? And if it is, then what would be the syntax? > > Thanks, > Aman Tandon >
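For reference, a basic JSON-facet sort looks like the sketch below (field names hypothetical). The sort option takes a single criterion per facet level — which is why sorting on multiple fields at once is the open question here — though each sub-facet can carry its own sort:

```json
{
  "categories": {
    "type": "terms",
    "field": "cat_s",
    "sort": "count desc",
    "facet": {
      "avg_price": "avg(price)"
    }
  }
}
```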
Solr Job opportunity - Noida, India
Hi Everyone, If anyone is interested in applying for the Solr Developer position in Noida, India, please forward me your resume with your contact number and email. *Company Name: Genpact Headstrong Capital Markets* *Experience required: 3 - 7 years* With Regards Aman Tandon
Re: Help: Lucidwork Fusion documentation
I am looking for the Lucidworks documentation. OK Chris, I will contact Lucidworks then. Thank you. On Friday, June 3, 2016, Chris Hostetter <hossman_luc...@fucit.org> wrote: > > Lucidworks Fusion is a commercial product, not a part of the Apache > Software Foundation - questions about using it are not really appropriate > for this mailing list. You should contact Lucidworks support directly... > > https://lucidworks.com/company/contact/ > > ...with that in mind, the docs for Fusion can be found here... > > https://doc.lucidworks.com/index.html > > > > : Date: Fri, 3 Jun 2016 04:40:57 +0530 > : From: Aman Tandon <amantandon...@gmail.com> > : Reply-To: solr-user@lucene.apache.org > : To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org> > : Subject: Help: Lucidwork Fusion documentation > : > : Hi, > : > : How could I download the Fusion documentation pdf ? If anyone is aware, > : please help me!! > : > : With Regards > : Aman Tandon > : > > -Hoss > http://www.lucidworks.com/ > -- Sent from Gmail Mobile
Help: Lucidwork Fusion documentation
Hi, How can I download the Fusion documentation PDF? If anyone is aware, please help me! With Regards Aman Tandon
Re: Configure it on server
Hi Prateek, Your question is a little ambiguous. Could you please describe more precisely what you want to configure on the server, and what your requirement and problem are? That would make it easier to understand your problem. With Regards Aman Tandon On Wed, Nov 18, 2015 at 4:29 PM, Prateek Sharma <prateek.sha...@amdocs.com> wrote: > Hi, > > Can you help me out how I can configure it on a server? > It was configured on one of our servers but I am unable to replicate it. > > Can you please help. > > Thanks, > Prateek > > This message and the information contained herein is proprietary and > confidential and subject to the Amdocs policy statement, > you may review at http://www.amdocs.com/email_disclaimer.asp >
Re: Exclude documents having same data in two fields
Hi, I tried to use the same as mentioned in the url <http://stackoverflow.com/questions/16258605/query-for-document-that-two-fields-are-equal> . And I used the description field to check because mapping field is multivalued. So I add the fq={!frange%20l=0%20u=1}strdist(title,description,edit) in my url, but I am getting this error. As mentioned below. Please take a look. *Solr Version 4.8.1* *Url is* http://localhost:8150/solr/core1/select?q.alt=*:*=big*,title,catid={!frange%20l=0%20u=1}strdist(title,description,edit)=edismax > > > 500 > 8 > > *:* > edismax > big*,title,catid > {!frange l=0 u=1}strdist(title,description,edit) > > > > > java.lang.RuntimeException at > org.apache.solr.search.ExtendedDismaxQParser$ExtendedDismaxConfiguration.(ExtendedDismaxQParser.java:1455) > at > org.apache.solr.search.ExtendedDismaxQParser.createConfiguration(ExtendedDismaxQParser.java:239) > at > org.apache.solr.search.ExtendedDismaxQParser.(ExtendedDismaxQParser.java:108) > at > org.apache.solr.search.ExtendedDismaxQParserPlugin.createParser(ExtendedDismaxQParserPlugin.java:37) > at org.apache.solr.search.QParser.getParser(QParser.java:315) at > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:144) > at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:197) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952) at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) > at > 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) > at > org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) > at > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) > at > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) > at > org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953) > at > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) > at > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) > at > org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023) > at > org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) > at > org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > > 500 > > > With Regards Aman Tandon On Thu, Oct 8, 2015 at 8:07 PM, Alessandro Benedetti < benedetti.ale...@gmail.com> wrote: > Hi agree with Nutch, > using the Function Range Query Parser, should do your trick : > > > https://lucene.apache.org/solr/5_3_0/solr-core/org/apache/solr/search/FunctionRangeQParserPlugin.html > > Cheers > > On 8 October 2015 at 13:31, NutchDev <nutchsolru...@gmail.com> wrote: > > > Hi Aman, > > > > Have a look at this , it has query time approach also using Solr function > > query, > > > > > > > http://stackoverflow.com/questions/15927893/how-to-check-equality-of-two-solr-fields > > > > > http://stackoverflow.com/questions/16258605/query-for-document-that-two-fields-are-equal > > > > > > > > -- > > View this message in context: > > > http://lucene.472066.n3.nabble.com/Exclude-documents-having-same-data-in-two-fields-tp4233408p4233489.html > > Sent from 
the Solr - User mailing list archive at Nabble.com. > > > > > > -- > -- > > Benedetti Alessandro > Visiting card - http://about.me/alessandro_benedetti > Blog - http://alexbenedetti.blogspot.co.uk > > "Tyger, tyger burning bright > In the forests of the night, > What immortal hand or eye > Could frame thy fearful symmetry?" > > William Blake - Songs of Experience -1794 England >
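For reference, the intended request (decoded; a sketch assuming single-valued title and description fields) would carry parameters along these lines — note that the spaces inside the {!frange} local params must be URL-encoded as %20, as in the original URL:

```text
q.alt=*:*
&defType=edismax
&fl=title,description
&fq={!frange l=0 u=1}strdist(title,description,edit)
```

strdist here returns an edit-distance-based similarity, so the frange bounds select documents whose two field values are within the given similarity range.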
Re: Exclude documents having same data in two fields
okay Thanks With Regards Aman Tandon On Fri, Oct 9, 2015 at 4:25 PM, Upayavira <u...@odoko.co.uk> wrote: > Just beware of performance here. This is fine for smaller indexes, but > for larger ones won't work so well. It will need to do this calculation > for every document in your index, thereby undoing all benefits of having > an inverted index. > > If your index (or resultset) is small enough, it can work, but might > catch you out later. > > Upayavira > > On Fri, Oct 9, 2015, at 10:59 AM, Aman Tandon wrote: > > Hi, > > > > I tried to use the same as mentioned in the url > > < > http://stackoverflow.com/questions/16258605/query-for-document-that-two-fields-are-equal > > > > . > > > > And I used the description field to check because mapping field > > is multivalued. > > > > So I add the fq={!frange%20l=0%20u=1}strdist(title,description,edit) in > > my > > url, but I am getting this error. As mentioned below. Please take a look. > > > > *Solr Version 4.8.1* > > > > *Url is* > > > http://localhost:8150/solr/core1/select?q.alt=*:*=big*,title,catid={!frange%20l=0%20u=1}strdist(title,description,edit)=edismax > > > > > > > > > > > 500 > > > 8 > > > > > > *:* > > > edismax > > > big*,title,catid > > > {!frange l=0 u=1}strdist(title,description,edit) > > > > > > > > > > > > > > > java.lang.RuntimeException at > > > > org.apache.solr.search.ExtendedDismaxQParser$ExtendedDismaxConfiguration.(ExtendedDismaxQParser.java:1455) > > > at > > > > org.apache.solr.search.ExtendedDismaxQParser.createConfiguration(ExtendedDismaxQParser.java:239) > > > at > > > > org.apache.solr.search.ExtendedDismaxQParser.(ExtendedDismaxQParser.java:108) > > > at > > > > org.apache.solr.search.ExtendedDismaxQParserPlugin.createParser(ExtendedDismaxQParserPlugin.java:37) > > > at org.apache.solr.search.QParser.getParser(QParser.java:315) at > > > > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:144) > > > at > > > > 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:197) > > > at > > > > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) > > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952) at > > > > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774) > > > at > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418) > > > at > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207) > > > at > > > > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) > > > at > > > > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) > > > at > > > > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) > > > at > > > > org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123) > > > at > > > > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) > > > at > > > > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99) > > > at > > > > org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953) > > > at > > > > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) > > > at > > > > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) > > > at > > > > org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1023) > > > at > > > > org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589) > > > at > > > > org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310) > > > at > > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > > > at > > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > > > at 
java.lang.Thread.run(Thread.java:745) > > > > > > 500 > > > > > > > > > > > > > With Regards > > Aman Tandon > > > > On Thu, Oct 8, 2015 at 8:07 PM, Alessandro Benedetti < > > benedetti.ale
Re: Exclude documents having same data in two fields
Thanks Mikhail for the suggestion. I will try that on Monday and will let you know. *@*Walter This was just a random requirement: to find the documents whose fields are not the same and then reindex only those. I can do a full index, but I was wondering if there might be some function or something. With Regards Aman Tandon On Fri, Oct 9, 2015 at 9:05 PM, Mikhail Khludnev <mkhlud...@griddynamics.com > wrote: > Aman, > > You can invoke Terms Component for the filed M, let it returns terms: > {a,c,d,f} > then you invoke it for field T let it return {b,c,f,e}, > then you intersect both lists (it's quite romantic if they are kept > ordered), you've got {c,f} > and then you applies filter: > fq=-((+M:c +T:c) (+M:f +T:f)) > etc > > > On Thu, Oct 8, 2015 at 8:29 AM, Aman Tandon <amantandon...@gmail.com> > wrote: > > > Hi, > > > > Is there a way in solr to remove all those documents from the search > > results in which two of the fields, *mapping* and *title* is the exactly > > same. > > > > With Regards > > Aman Tandon > > > > > > -- > Sincerely yours > Mikhail Khludnev > Principal Engineer, > Grid Dynamics > > <http://www.griddynamics.com> > <mkhlud...@griddynamics.com> >
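Mikhail's recipe — fetch the term lists for both fields, intersect them, and build one negative filter query — can be sketched in Python (an illustration only; M and T are the hypothetical field names from his example, and the term lists stand in for Terms Component responses):

```python
def build_exclusion_filter(terms_m, terms_t):
    """Intersect the term lists returned for fields M and T, then build a
    negative filter query excluding docs where both fields hold the same
    shared term."""
    shared = sorted(set(terms_m) & set(terms_t))
    clauses = " ".join(f"(+M:{t} +T:{t})" for t in shared)
    return f"fq=-({clauses})"

print(build_exclusion_filter(["a", "c", "d", "f"], ["b", "c", "f", "e"]))
# -> fq=-((+M:c +T:c) (+M:f +T:f))
```

This reproduces the fq from Mikhail's example; note the filter only grows with the number of terms shared by both fields, not with index size.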
Re: Exclude documents having same data in two fields
No Susheel. As our index size is 62 GB, it seems hard to find those records. With Regards Aman Tandon On Fri, Oct 9, 2015 at 7:30 PM, Susheel Kumar <susheel2...@gmail.com> wrote: > Hi Aman, Did the problem resolved or still having some errors. > > Thnx > > On Fri, Oct 9, 2015 at 8:28 AM, Aman Tandon <amantandon...@gmail.com> > wrote: > > > okay Thanks > > > > With Regards > > Aman Tandon > > > > On Fri, Oct 9, 2015 at 4:25 PM, Upayavira <u...@odoko.co.uk> wrote: > > > > > Just beware of performance here. This is fine for smaller indexes, but > > > for larger ones won't work so well. It will need to do this calculation > > > for every document in your index, thereby undoing all benefits of > having > > > an inverted index. > > > > > > If your index (or resultset) is small enough, it can work, but might > > > catch you out later. > > > > > > Upayavira > > > > > > On Fri, Oct 9, 2015, at 10:59 AM, Aman Tandon wrote: > > > > Hi, > > > > > > > > I tried to use the same as mentioned in the url > > > > < > > > > > > http://stackoverflow.com/questions/16258605/query-for-document-that-two-fields-are-equal > > > > > > > > . > > > > > > > > And I used the description field to check because mapping field > > > > is multivalued. > > > > > > > > So I add the fq={!frange%20l=0%20u=1}strdist(title,description,edit) > in > > > > my > > > > url, but I am getting this error. As mentioned below. Please take a > > look.
> > > > > > > > *Solr Version 4.8.1* > > > > > > > > *Url is* > > > > > > > > > > http://localhost:8150/solr/core1/select?q.alt=*:*=big*,title,catid={!frange%20l=0%20u=1}strdist(title,description,edit)=edismax > > > > > > > > > > > > > > > > > > > 500 > > > > > 8 > > > > > > > > > > *:* > > > > > edismax > > > > > big*,title,catid > > > > > {!frange l=0 > u=1}strdist(title,description,edit) > > > > > > > > > > > > > > > > > > > > > > > > > java.lang.RuntimeException at > > > > > > > > > > > org.apache.solr.search.ExtendedDismaxQParser$ExtendedDismaxConfiguration.(ExtendedDismaxQParser.java:1455) > > > > > at > > > > > > > > > > > org.apache.solr.search.ExtendedDismaxQParser.createConfiguration(ExtendedDismaxQParser.java:239) > > > > > at > > > > > > > > > > > org.apache.solr.search.ExtendedDismaxQParser.(ExtendedDismaxQParser.java:108) > > > > > at > > > > > > > > > > > org.apache.solr.search.ExtendedDismaxQParserPlugin.createParser(ExtendedDismaxQParserPlugin.java:37) > > > > > at org.apache.solr.search.QParser.getParser(QParser.java:315) at > > > > > > > > > > > org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:144) > > > > > at > > > > > > > > > > > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:197) > > > > > at > > > > > > > > > > > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) > > > > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952) at > > > > > > > > > > > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774) > > > > > at > > > > > > > > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418) > > > > > at > > > > > > > > > > > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207) > > > > > at > > > > > > > > > > > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) > > > > > at > > > > > > > > > > > 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) > > > > > at > > > > > > > > > > > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222) > > > > > at > > > > > > > > > &g
Re: Exclude documents having same data in two fields
But I want to do it at run time, without indexing an extra field. With Regards Aman Tandon On Thu, Oct 8, 2015 at 11:55 AM, NutchDev <nutchsolru...@gmail.com> wrote: > One option could be creating another boolean field field1_equals_field2 and > set it to true for documents matching it while indexing. Use this field as > a > filter criteria while querying solr. > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Exclude-documents-having-same-data-in-two-fields-tp4233408p4233411.html > Sent from the Solr - User mailing list archive at Nabble.com. >
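For completeness, NutchDev's index-time approach can be sketched as a small preprocessing step applied to each document before it is sent to Solr (field names hypothetical):

```python
def add_equality_flag(doc):
    """Add a boolean field recording whether title and mapping match,
    so queries can filter on it cheaply, e.g. fq=-title_equals_mapping:true."""
    doc["title_equals_mapping"] = doc.get("title") == doc.get("mapping")
    return doc

doc = add_equality_flag({"id": "1", "title": "red shoes", "mapping": "red shoes"})
print(doc["title_equals_mapping"])  # -> True
```

The trade-off the thread discusses: this needs a reindex, but filtering on a precomputed boolean is far cheaper than evaluating a function query over every document at search time.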
Which one is faster: synonym_edismax or edismax?
Hi, Currently we are using the *synonym_edismax query parser* plugin to handle multi-word synonyms. I want to know which is faster: *edismax* or *synonym_edismax*. As we have only a small number of multi-word synonyms in our dictionary, we are thinking of using the standard edismax query parser. Any suggestions or observations would be helpful. With Regards Aman Tandon
Exclude documents having same data in two fields
Hi, Is there a way in Solr to remove from the search results all documents in which two of the fields, *mapping* and *title*, are exactly the same? With Regards Aman Tandon
Re: How to know index file in OS Cache
okay thanks Markus :) With Regards Aman Tandon On Fri, Sep 25, 2015 at 12:27 PM, Markus Jelsma <markus.jel...@openindex.io> wrote: > Hello - as far as i remember, you don't. A file itself is not the unit to > cache, but blocks are. > Markus > > > -Original message- > > From:Aman Tandon <amantandon...@gmail.com> > > Sent: Friday 25th September 2015 5:56 > > To: solr-user@lucene.apache.org > > Subject: How to know index file in OS Cache > > > > Hi, > > > > Is there any way to know that the index file/s is present in the OS cache > > or RAM. I want to check if the index is present in the RAM or in OS cache > > and which files are not in either of them. > > > > With Regards > > Aman Tandon > > >
Re: How to know index file in OS Cache
Awesome, thank you Mikhail. This is what I was looking for. This was just a random question that popped up in my mind, so I asked it on the group. With Regards Aman Tandon On Fri, Sep 25, 2015 at 2:49 PM, Mikhail Khludnev < mkhlud...@griddynamics.com> wrote: > What about Linux: > $less /proc//maps > $pmap > > On Fri, Sep 25, 2015 at 10:57 AM, Markus Jelsma < > markus.jel...@openindex.io> > wrote: > > > Hello - as far as i remember, you don't. A file itself is not the unit to > > cache, but blocks are. > > Markus > > > > > > -Original message- > > > From:Aman Tandon <amantandon...@gmail.com> > > > Sent: Friday 25th September 2015 5:56 > > > To: solr-user@lucene.apache.org > > > Subject: How to know index file in OS Cache > > > > > > Hi, > > > > > > Is there any way to know that the index file/s is present in the OS > cache > > > or RAM. I want to check if the index is present in the RAM or in OS > cache > > > and which files are not in either of them. > > > > > > With Regards > > > Aman Tandon > > > > > > > > > -- > Sincerely yours > Mikhail Khludnev > Principal Engineer, > Grid Dynamics > > <http://www.griddynamics.com> > <mkhlud...@griddynamics.com> >
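Mikhail's /proc/<pid>/maps suggestion can be scripted: each line of that file describes one mapped region, with the backing file path in the last column. A Python sketch (the sample line is fabricated for illustration; note this shows which index files are memory-mapped by the process, not which pages are actually resident in RAM — per-page residency would need mincore(2) or a tool such as vmtouch):

```python
def mapped_index_files(maps_text):
    """Parse /proc/<pid>/maps output and list the memory-mapped files
    whose path contains an index directory."""
    files = set()
    for line in maps_text.splitlines():
        parts = line.split(None, 5)  # last column (if any) is the pathname
        if len(parts) == 6 and "/index/" in parts[5]:
            files.add(parts[5])
    return sorted(files)

sample = (
    "7f2a00000000-7f2a10000000 r--s 00000000 08:01 123 /var/solr/data/index/_0.cfs\n"
    "7f2a20000000-7f2a20001000 r-xp 00000000 08:01 456 /usr/lib/libc.so\n"
)
print(mapped_index_files(sample))  # -> ['/var/solr/data/index/_0.cfs']
```

In practice one would read `/proc/<solr-pid>/maps` directly, or run `pmap <solr-pid>` for the same information.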
How to know index file in OS Cache
Hi, Is there any way to know whether the index file(s) are present in the OS cache or in RAM? I want to check if the index is in RAM or in the OS cache, and which files are in neither of them. With Regards Aman Tandon
Re: solr update dynamic field generates multiValued error
Sure. thank you Upayavira With Regards Aman Tandon On Mon, Sep 21, 2015 at 6:01 PM, Upayavira <u...@odoko.co.uk> wrote: > You cannot do multi valued fields with LatLongType fields. Therefore, if > that is a need, you will have to investigate RPT fields. > > I'm not sure how you do distance boosting there, so I'd suggest you ask > that as a separate question with a new title. > > Upayavira > > On Mon, Sep 21, 2015, at 01:27 PM, Aman Tandon wrote: > > We are using LatLonType to use the gradual boosting / distance based > > boosting of search results. > > > > With Regards > > Aman Tandon > > > > On Mon, Sep 21, 2015 at 5:39 PM, Upayavira <u...@odoko.co.uk> wrote: > > > > > Aman, > > > > > > I cannot promise to answer questions promptly - like most people on > this > > > list, we answer if/when we have a gap in our workload. > > > > > > The reason you are getting the non multiValued field error is because > > > your latlon field does not have multiValued="true" enabled. > > > > > > However, the field type definition notes that this field type does not > > > support multivalued fields, so you're not gonna get anywhere with that > > > route. > > > > > > Have you tried the location_rpt type? > > > (solr.SpatialRecursivePrefixTreeFieldType). This is a newer, and as I > > > understand it, far more flexible field type - for example, you can > index > > > shapes into it as well as locations. 
> > > > > > I'd suggest you read this page, and pay particular attention to > mentions > > > of RPT: > > > > > > https://cwiki.apache.org/confluence/display/solr/Spatial+Search > > > > > > Upayavira > > > > > > On Mon, Sep 21, 2015, at 10:36 AM, Aman Tandon wrote: > > > > Upayavira, please help > > > > > > > > With Regards > > > > Aman Tandon > > > > > > > > On Mon, Sep 21, 2015 at 2:38 PM, Aman Tandon < > amantandon...@gmail.com> > > > > wrote: > > > > > > > > > Error is > > > > > > > > > > > > > > > > > > > > 400 > > > > name="QTime">28ERROR: > > > > > [doc=9474144846] multiple values encountered for non multiValued > field > > > > > latlon_0_coordinate: [11.0183, 11.0183] > > > > name="code">400 > > > > > > > > > > > > > > > And my configuration is > > > > > > > > > > > > > > > > > > > stored="true" /> > > > > > > > > > > > > > > > > > > > subFieldSuffix="_coordinate"/> > > > > > > > > > >> > > > required="false" multiValued="false" /> > > > > > > > > > > how you know it is because of stored="true"? > > > > > > > > > > As Erick replied in the last mail thread, > > > > > I'm not getting any multiple values in the _coordinate fields. > > > However, I > > > > > _do_ get the error if my dynamic *_coordinate field is set to > > > > > stored="true". > > > > > > > > > > And stored="true" is mandatory for using the atomic updates. > > > > > > > > > > With Regards > > > > > Aman Tandon > > > > > > > > > > On Mon, Sep 21, 2015 at 2:22 PM, Upayavira <u...@odoko.co.uk> wrote: > > > > > > > > > >> Can you show the error you are getting, and how you know it is > because > > > > >> of stored="true"? > > > > >> > > > > >> Upayavira > > > > >> > > > > >> On Mon, Sep 21, 2015, at 09:30 AM, Aman Tandon wrote: > > > > >> > Hi Erick, > > > > >> > > > > > >> > I am getting the same error because my dynamic field > *_coordinate is > > > > >> > stored="true". > > > > >> > How can I get rid of this error? > > > > >> > > > > > >> > And I have to use the atomic update. Please help!! 
> > > > >> > > > > > >> > With Regards > > > > >> > Aman Tandon > > > > >>
Re: solr update dynamic field generates multiValued error
We are using LatLonType to use the gradual boosting / distance based boosting of search results. With Regards Aman Tandon On Mon, Sep 21, 2015 at 5:39 PM, Upayavira <u...@odoko.co.uk> wrote: > Aman, > > I cannot promise to answer questions promptly - like most people on this > list, we answer if/when we have a gap in our workload. > > The reason you are getting the non multiValued field error is because > your latlon field does not have multiValued="true" enabled. > > However, the field type definition notes that this field type does not > support multivalued fields, so you're not gonna get anywhere with that > route. > > Have you tried the location_rpt type? > (solr.SpatialRecursivePrefixTreeFieldType). This is a newer, and as I > understand it, far more flexible field type - for example, you can index > shapes into it as well as locations. > > I'd suggest you read this page, and pay particular attention to mentions > of RPT: > > https://cwiki.apache.org/confluence/display/solr/Spatial+Search > > Upayavira > > On Mon, Sep 21, 2015, at 10:36 AM, Aman Tandon wrote: > > Upayavira, please help > > > > With Regards > > Aman Tandon > > > > On Mon, Sep 21, 2015 at 2:38 PM, Aman Tandon <amantandon...@gmail.com> > > wrote: > > > > > Error is > > > > > > > > > > > > 400 > > name="QTime">28ERROR: > > > [doc=9474144846] multiple values encountered for non multiValued field > > > latlon_0_coordinate: [11.0183, 11.0183] > > name="code">400 > > > > > > > > > And my configuration is > > > > > > > > > > > stored="true" /> > > > > > > > > > > > subFieldSuffix="_coordinate"/> > > > > > >> > required="false" multiValued="false" /> > > > > > > how you know it is because of stored="true"? > > > > > > As Erick replied in the last mail thread, > > > I'm not getting any multiple values in the _coordinate fields. > However, I > > > _do_ get the error if my dynamic *_coordinate field is set to > > > stored="true". > > > > > > And stored="true" is mandatory for using the atomic updates. 
> > > > > > With Regards > > > Aman Tandon > > > > > > On Mon, Sep 21, 2015 at 2:22 PM, Upayavira <u...@odoko.co.uk> wrote: > > > > > >> Can you show the error you are getting, and how you know it is because > > >> of stored="true"? > > >> > > >> Upayavira > > >> > > >> On Mon, Sep 21, 2015, at 09:30 AM, Aman Tandon wrote: > > >> > Hi Erick, > > >> > > > >> > I am getting the same error because my dynamic field *_coordinate is > > >> > stored="true". > > >> > How can I get rid of this error? > > >> > > > >> > And I have to use the atomic update. Please help!! > > >> > > > >> > With Regards > > >> > Aman Tandon > > >> > > > >> > On Tue, Aug 5, 2014 at 10:27 PM, Franco Giacosa <fgiac...@gmail.com > > > > >> > wrote: > > >> > > > >> > > Hey Erick, i think that you were right, there was a mix in the > > >> schemas and > > >> > > that was generating the error on some of the documents. > > >> > > > > >> > > Thanks for the help guys! > > >> > > > > >> > > > > >> > > 2014-08-05 1:28 GMT-03:00 Erick Erickson <erickerick...@gmail.com > >: > > >> > > > > >> > > > Hmmm, I jus tried this with a 4.x build and I can update the > > >> document > > >> > > > multiple times without a problem. I just indexed the standard > > >> exampledocs > > >> > > > and then updated a doc like this (vidcard.xml was the base): > > >> > > > > > >> > > > > > >> > > > > > >> > > > EN7800GTX/2DHTV/256M > > >> > > > > > >> > > > eoe changed this > > >> puppy > > >> > > > > > >> > > > > > >> > > > > > >> > > > > > >> > > > I'm not getting any multiple values in the _coordinate
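A sketch of the configuration this thread converges on (field and type names hypothetical; based on Erick's observation that the internal *_coordinate sub-fields must stay stored="false", while the LatLonType field itself can be stored to support atomic updates):

```xml
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<!-- user-facing field: holds the original "lat,lon" string, stored for atomic updates -->
<field name="latlon" type="location" indexed="true" stored="true" multiValued="false"/>
<!-- internal sub-fields created by LatLonType: indexed only, never stored -->
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
```

Storing the derived *_coordinate fields makes atomic updates re-submit them alongside the values LatLonType regenerates, producing the "multiple values encountered for non multiValued field" error seen above.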
Spatial Search: distance based boosting
Hi, Is there a way in Solr to do distance-based boosting using a spatial RPT field? With Regards Aman Tandon
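In older Solr versions the RPT field type does not work with `geodist()` directly, but it can report distance through `{!geofilt score=distance}`, which can then feed a boost function. A minimal sketch of the request parameters, assuming a hypothetical RPT field named `geo_rpt` (the `recip` constants are illustrative and need tuning):

```
q=jute bags
defType=edismax
qf=title

# {!geofilt score=distance filter=false ...} returns each document's
# distance (km) from pt as its score; recip() turns smaller distances
# into larger multiplicative boosts.
boost=recip(query({!geofilt score=distance filter=false sfield=geo_rpt pt=28.61,77.20 d=100}),1,10,10)
```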
Re: solr update dynamic field generates multiValued error
Upayavira, please help With Regards Aman Tandon On Mon, Sep 21, 2015 at 2:38 PM, Aman Tandon <amantandon...@gmail.com> wrote: > Error is > > > > 400 name="QTime">28ERROR: > [doc=9474144846] multiple values encountered for non multiValued field > latlon_0_coordinate: [11.0183, 11.0183] name="code">400 > > > And my configuration is > > > stored="true" /> > > > subFieldSuffix="_coordinate"/> > >required="false" multiValued="false" /> > > how you know it is because of stored="true"? > > As Erick replied in the last mail thread, > I'm not getting any multiple values in the _coordinate fields. However, I > _do_ get the error if my dynamic *_coordinate field is set to > stored="true". > > And stored="true" is mandatory for using the atomic updates. > > With Regards > Aman Tandon > > On Mon, Sep 21, 2015 at 2:22 PM, Upayavira <u...@odoko.co.uk> wrote: > >> Can you show the error you are getting, and how you know it is because >> of stored="true"? >> >> Upayavira >> >> On Mon, Sep 21, 2015, at 09:30 AM, Aman Tandon wrote: >> > Hi Erick, >> > >> > I am getting the same error because my dynamic field *_coordinate is >> > stored="true". >> > How can I get rid of this error? >> > >> > And I have to use the atomic update. Please help!! >> > >> > With Regards >> > Aman Tandon >> > >> > On Tue, Aug 5, 2014 at 10:27 PM, Franco Giacosa <fgiac...@gmail.com> >> > wrote: >> > >> > > Hey Erick, i think that you were right, there was a mix in the >> schemas and >> > > that was generating the error on some of the documents. >> > > >> > > Thanks for the help guys! >> > > >> > > >> > > 2014-08-05 1:28 GMT-03:00 Erick Erickson <erickerick...@gmail.com>: >> > > >> > > > Hmmm, I jus tried this with a 4.x build and I can update the >> document >> > > > multiple times without a problem. 
I just indexed the standard >> exampledocs >> > > > and then updated a doc like this (vidcard.xml was the base): >> > > > >> > > > >> > > > >> > > > EN7800GTX/2DHTV/256M >> > > > >> > > > eoe changed this >> puppy >> > > > >> > > > >> > > > >> > > > >> > > > I'm not getting any multiple values in the _coordinate fields. >> However, I >> > > > _do_ get the error if my dynamic *_coordinate field is set to >> > > > stored="true". >> > > > >> > > > Did you perhaps change this at some point? Whenever I change the >> schema, >> > > I >> > > > try to 'rm -rf solr/collection/data' just to be sure I've purged all >> > > traces >> > > > of the former schema definition. >> > > > >> > > > Best, >> > > > Erick >> > > > >> > > > >> > > > On Mon, Aug 4, 2014 at 7:04 PM, Franco Giacosa <fgiac...@gmail.com> >> > > wrote: >> > > > >> > > > > No, they are not declarad explicitly. >> > > > > >> > > > > This is how they are created: >> > > > > >> > > > > > stored="true"/> >> > > > > >> > > > > > > > > > stored="false"/> >> > > > > >> > > > > > > > > > subFieldSuffix="_coordinate"/> >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > 2014-08-04 22:28 GMT-03:00 Michael Ryan <mr...@moreover.com>: >> > > > > >> > > > > > Are the latLong_0_coordinate and latLong_1_coordinate fields >> > > populated >> > > > > > using copyField? If so, this sounds like it could be >> > > > > > https://issues.apache.org/jira/browse/SOLR-3502. >> > > > > > >> > > > > > -Michael >> > > > > > >> > > > > > -Original Message- >> > > > > > From: Franco Giacosa [mailto:fgiac...@gmail.com] >> > > > > > Sent: Monday, August 04, 2014 9:05 PM &
Re: solr update dynamic field generates multiValued error
Hi Erick, I am getting the same error because my dynamic field *_coordinate is stored="true". How can I get rid of this error? And I have to use the atomic update. Please help!! With Regards Aman Tandon On Tue, Aug 5, 2014 at 10:27 PM, Franco Giacosa <fgiac...@gmail.com> wrote: > Hey Erick, i think that you were right, there was a mix in the schemas and > that was generating the error on some of the documents. > > Thanks for the help guys! > > > 2014-08-05 1:28 GMT-03:00 Erick Erickson <erickerick...@gmail.com>: > > > Hmmm, I jus tried this with a 4.x build and I can update the document > > multiple times without a problem. I just indexed the standard exampledocs > > and then updated a doc like this (vidcard.xml was the base): > > > > > > > > EN7800GTX/2DHTV/256M > > > > eoe changed this puppy > > > > > > > > > > I'm not getting any multiple values in the _coordinate fields. However, I > > _do_ get the error if my dynamic *_coordinate field is set to > > stored="true". > > > > Did you perhaps change this at some point? Whenever I change the schema, > I > > try to 'rm -rf solr/collection/data' just to be sure I've purged all > traces > > of the former schema definition. > > > > Best, > > Erick > > > > > > On Mon, Aug 4, 2014 at 7:04 PM, Franco Giacosa <fgiac...@gmail.com> > wrote: > > > > > No, they are not declarad explicitly. > > > > > > This is how they are created: > > > > > > > > > > > > > > stored="false"/> > > > > > > > > subFieldSuffix="_coordinate"/> > > > > > > > > > > > > > > > 2014-08-04 22:28 GMT-03:00 Michael Ryan <mr...@moreover.com>: > > > > > > > Are the latLong_0_coordinate and latLong_1_coordinate fields > populated > > > > using copyField? If so, this sounds like it could be > > > > https://issues.apache.org/jira/browse/SOLR-3502. 
> > > > > > > > -Michael > > > > > > > > -Original Message- > > > > From: Franco Giacosa [mailto:fgiac...@gmail.com] > > > > Sent: Monday, August 04, 2014 9:05 PM > > > > To: solr-user@lucene.apache.org > > > > Subject: solr update dynamic field generates multiValued error > > > > > > > > Hello everyone, this is my first time posting a question, so forgive > me > > > if > > > > i'm missing something. > > > > > > > > This is my problem: > > > > > > > > I have a schema.xml that has the following latLong information > > > > > > > > The dynamicField generates 2 dynamic fields that have the lat and the > > > long > > > > (latLong_0_coordinate and latLong_1_coordinate) > > > > > > > > So for example a document will have > > > > > > > > "latLong_0_coordinate": 40.4114, "latLong_1_coordinate": -74.1031, > > > > "latLong": "40.4114,-74.1031", > > > > > > > > Now when I try to update a document (i don't update the latLong > field. > > I > > > > just update other parts of the document using atomic update) solr > > > > re-creates the dynamicField and adds the same value again, like its > > using > > > > add instead of set. So when i do an update the fields of the doc look > > > like > > > > this > > > > > > > > "latLong_0_coordinate": [40.4114,40.4114] "latLong_1_coordinate": > > > > [-74.1031,-74.1031] "latLong": "40.4114,-74.1031", > > > > > > > > So the dynamicFields now have 2 values, so the next time that I want > to > > > > update the document a schema error is throw because im trying to > store > > a > > > > collection into a none multivalued field. > > > > > > > > > > > > Thanks in advanced. > > > > > > > > > >
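For reference, an atomic (partial) update uses modifiers such as `set` instead of re-sending the whole document; Solr re-reads all stored fields internally to rebuild the document, which is why derived fields like the `*_coordinate` sub-fields must not also be stored. A hedged sketch (URL, collection name, and field values are illustrative; the id is taken from the error in this thread):

```
curl 'http://localhost:8983/solr/collection1/update?commit=true' \
  -H 'Content-Type: application/json' \
  -d '[{"id": "9474144846", "title": {"set": "jute bags - updated"}}]'
```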
Re: solr update dynamic field generates multiValued error
Error is 40028ERROR: [doc=9474144846] multiple values encountered for non multiValued field latlon_0_coordinate: [11.0183, 11.0183]400 And my configuration is how you know it is because of stored="true"? As Erick replied in the last mail thread, I'm not getting any multiple values in the _coordinate fields. However, I _do_ get the error if my dynamic *_coordinate field is set to stored="true". And stored="true" is mandatory for using the atomic updates. With Regards Aman Tandon On Mon, Sep 21, 2015 at 2:22 PM, Upayavira <u...@odoko.co.uk> wrote: > Can you show the error you are getting, and how you know it is because > of stored="true"? > > Upayavira > > On Mon, Sep 21, 2015, at 09:30 AM, Aman Tandon wrote: > > Hi Erick, > > > > I am getting the same error because my dynamic field *_coordinate is > > stored="true". > > How can I get rid of this error? > > > > And I have to use the atomic update. Please help!! > > > > With Regards > > Aman Tandon > > > > On Tue, Aug 5, 2014 at 10:27 PM, Franco Giacosa <fgiac...@gmail.com> > > wrote: > > > > > Hey Erick, i think that you were right, there was a mix in the schemas > and > > > that was generating the error on some of the documents. > > > > > > Thanks for the help guys! > > > > > > > > > 2014-08-05 1:28 GMT-03:00 Erick Erickson <erickerick...@gmail.com>: > > > > > > > Hmmm, I jus tried this with a 4.x build and I can update the document > > > > multiple times without a problem. I just indexed the standard > exampledocs > > > > and then updated a doc like this (vidcard.xml was the base): > > > > > > > > > > > > > > > > EN7800GTX/2DHTV/256M > > > > > > > > eoe changed this puppy > > > > > > > > > > > > > > > > > > > > I'm not getting any multiple values in the _coordinate fields. > However, I > > > > _do_ get the error if my dynamic *_coordinate field is set to > > > > stored="true". > > > > > > > > Did you perhaps change this at some point? 
Whenever I change the > schema, > > > I > > > > try to 'rm -rf solr/collection/data' just to be sure I've purged all > > > traces > > > > of the former schema definition. > > > > > > > > Best, > > > > Erick > > > > > > > > > > > > On Mon, Aug 4, 2014 at 7:04 PM, Franco Giacosa <fgiac...@gmail.com> > > > wrote: > > > > > > > > > No, they are not declarad explicitly. > > > > > > > > > > This is how they are created: > > > > > > > > > > stored="true"/> > > > > > > > > > > > > > > stored="false"/> > > > > > > > > > > > > > > subFieldSuffix="_coordinate"/> > > > > > > > > > > > > > > > > > > > > > > > > > 2014-08-04 22:28 GMT-03:00 Michael Ryan <mr...@moreover.com>: > > > > > > > > > > > Are the latLong_0_coordinate and latLong_1_coordinate fields > > > populated > > > > > > using copyField? If so, this sounds like it could be > > > > > > https://issues.apache.org/jira/browse/SOLR-3502. > > > > > > > > > > > > -Michael > > > > > > > > > > > > -Original Message- > > > > > > From: Franco Giacosa [mailto:fgiac...@gmail.com] > > > > > > Sent: Monday, August 04, 2014 9:05 PM > > > > > > To: solr-user@lucene.apache.org > > > > > > Subject: solr update dynamic field generates multiValued error > > > > > > > > > > > > Hello everyone, this is my first time posting a question, so > forgive > > > me > > > > > if > > > > > > i'm missing something. > > > > > > > > > > > > This is my problem: > > > > > > > > > > > > I have a schema.xml that has the following latLong information > > > > > > > > > > > > The dynamicField generates 2 dynamic fields that have the lat > and the > > > > > long > > > > > > (latLong_0_coordinate and latLong_1_coordinate) > > > > > > > > > > > > So for example a document will have > > > > > > > > > > > > "latLong_0_coordinate": 40.4114, "latLong_1_coordinate": > -74.1031, > > > > > > "latLong": "40.4114,-74.1031", > > > > > > > > > > > > Now when I try to update a document (i don't update the latLong > > > field. 
> > > > I > > > > > > just update other parts of the document using atomic update) solr > > > > > > re-creates the dynamicField and adds the same value again, like > its > > > > using > > > > > > add instead of set. So when i do an update the fields of the doc > look > > > > > like > > > > > > this > > > > > > > > > > > > "latLong_0_coordinate": [40.4114,40.4114] "latLong_1_coordinate": > > > > > > [-74.1031,-74.1031] "latLong": "40.4114,-74.1031", > > > > > > > > > > > > So the dynamicFields now have 2 values, so the next time that I > want > > > to > > > > > > update the document a schema error is throw because im trying to > > > store > > > > a > > > > > > collection into a none multivalued field. > > > > > > > > > > > > > > > > > > Thanks in advanced. > > > > > > > > > > > > > > > > > > >
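The workaround discussed in this thread (and tracked as SOLR-3502) is to keep the generated coordinate sub-fields unstored, so atomic updates cannot re-add their values. A schema sketch, with field names as used in the thread but attribute details assumed:

```xml
<fieldType name="location" class="solr.LatLonType"
           subFieldSuffix="_coordinate"/>

<!-- the source field stays stored, so atomic updates still work -->
<field name="latLong" type="location" indexed="true" stored="true"/>

<!-- stored="false" is the key: stored coordinate sub-fields get
     re-added on every atomic update, producing the multiValued error -->
<dynamicField name="*_coordinate" type="tdouble"
              indexed="true" stored="false"/>
```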
Re: How to reordering search result by some function query
> > boost=product_guideline_score Thank you Upayavira. Leonardo, thanks for the suggestion. But I think boost parameter will work great for us. Thank you so much for your help. With Regards Aman Tandon On Thu, Sep 10, 2015 at 5:11 PM, Upayavira <u...@odoko.co.uk> wrote: > Aman, > > If you are using edismax then what you have written is just fine. > > For Lucene query parser queries, wrap them with the boost query parser: > > q={!boost b=product_guideline_score v=$qq}=jute > > Note in your example you don't need product(), just do > boost=product_guideline_score > > Upayavira > > On Thu, Sep 10, 2015, at 07:33 AM, Aman Tandon wrote: > > Hi, > > > > I figured it out to implement the same. I will be doing this by using the > > boost parameter > > > > e.g. http://server:8112/solr/products/select?q=jute=title > > *=product(1,product_guideline_score)* > > > > If there is any other alternative then please suggest. > > > > With Regards > > Aman Tandon > > > > On Thu, Sep 10, 2015 at 11:02 AM, Aman Tandon <amantandon...@gmail.com> > > wrote: > > > > > Hi, > > > > > > I have a requirement to reorder the search results by multiplying the > *text relevance > > > score* of a product with the *product_guideline_score,* which will be > > > stored in index and will have some floating point number. > > > > > > e.g. 
On searching the *jute* in title if we got some results ID1 & ID2 > > > > > > ID1 -> title = jute > > > score = 8.0 > > > * product_guideline_score = 2.0* > > > > > > ID2 -> title = jute bags > > > score = 7.5 > > > * product_guideline_score** = 2.2* > > > > > > So the new score should be like this > > > > > > ID1 -> title = jute > > > score = *product_score * 8 = 16.0* > > > * product_guideline_score** = 2.0* > > > > > > ID2 -> title = jute bags > > > score = *product_score * 7.5 = 16.5* > > > * product_guideline_score** = 2.2* > > > > > > *So new ordering should be* > > > > > > ID2 -> title = jute bags > > > score* = 16.5* > > > > > > ID1 -> title = jute > > > score =* 16.0* > > > > > > How can I do this in single query on runtime in solr. > > > > > > With Regards > > > Aman Tandon > > > >
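The archive stripped the `&` separators from the query strings quoted above; a reconstructed sketch of both forms Upayavira describes (field names as in the thread):

```
# edismax: multiplicative boost by the stored score field
q=jute&defType=edismax&qf=title&boost=product_guideline_score&fl=id,title,score

# lucene query parser: wrap the query with the boost query parser
q={!boost b=product_guideline_score v=$qq}&qq=jute
```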
Re: How to reordering search result by some function query
Hi, I figured it out to implement the same. I will be doing this by using the boost parameter e.g. http://server:8112/solr/products/select?q=jute=title *=product(1,product_guideline_score)* If there is any other alternative then please suggest. With Regards Aman Tandon On Thu, Sep 10, 2015 at 11:02 AM, Aman Tandon <amantandon...@gmail.com> wrote: > Hi, > > I have a requirement to reorder the search results by multiplying the *text > relevance > score* of a product with the *product_guideline_score,* which will be > stored in index and will have some floating point number. > > e.g. On searching the *jute* in title if we got some results ID1 & ID2 > > ID1 -> title = jute > score = 8.0 > * product_guideline_score = 2.0* > > ID2 -> title = jute bags > score = 7.5 > * product_guideline_score** = 2.2* > > So the new score should be like this > > ID1 -> title = jute > score = *product_score * 8 = 16.0* > * product_guideline_score** = 2.0* > > ID2 -> title = jute bags > score = *product_score * 7.5 = 16.5* > * product_guideline_score** = 2.2* > > *So new ordering should be* > > ID2 -> title = jute bags > score* = 16.5* > > ID1 -> title = jute > score =* 16.0* > > How can I do this in single query on runtime in solr. > > With Regards > Aman Tandon >
Boosting related doubt?
Hi, I need to ask that when i am looking for the all the parameters of the query using the *echoParams=ALL*, I am getting the boost parameter twice in the information printed on the browser screen. So does it mean that it is also applying twice on the data/result set and we are using the ? ** * 0* * 66* * * ** * map(query({!dismax qf=mcatid v=$mc1 pf=""}),0,0,1,2.0)* * map(eff_views,1,2,1.15,1)* * map(query({!dismax qf=titlex v=$ql1 pf=""}),0,0,1,1.5)* * map(query({!dismax qf=titlex v=$ql2 pf=""}),0,0,1,1.5)* * map(query({!dismax qf=attribs v='poorDescription' pf=''},0),0,0,1,0.02)* * if(exists(itemprice2),map(query({!dismax qf=itemprice2 v='0'}),0,0,1.2,1),1)* * map(sdesclen,0,150,1,1.5)* * map(sdesclen,0,0,0.1,1)* * map(CustTypeWt,700,1869,1.1,1)* * map(CustTypeWt,699,699,1.2,1)* * map(CustTypeWt,199,199,1.3,1)* * map(CustTypeWt,0,179,1.35,1)* * map(CustTypeWt,3399,3999,0.07,1)* * map(query({!dismax qf=attribs v='hot'}),0,0,1,1.2)* * map(query({!dismax qf=isphoto v='true' pf=""}),0,0,0.05,1)* ** ** * mcatid:(1223 6240 825 1936 31235) titlex:("imswjutebagimsw")* * attribs:(locprefglobal locprefnational locprefcity locprefunknown)* * displayid:4768979112* * +((+datatype:product +attribs:(aprstatus20 aprstatus40 aprstatus50) +aggregate:true -attribs:liststatusnfl +((+countryiso:IN +isfcp:true) (+CustTypeWt:[149 TO 1499]) CustTypeWt:1870)) (+datatype:company -attribs:liststatusnfl +((+countryiso:IN +isfcp:true) (+CustTypeWt:[149 TO 1499]) CustTypeWt:1870))) -attribs:liststatusdnf* ** *2-1 470%* ** * {!ex=cityf}city* * {!ex=datatypef}datatype* * {!ex=biztypef}biztype* ** *default* *ALL* 
*displayid,datatype,title,smalldescorg,photo,catid,mcatname,companyname,CustTypeWt,glusrid,usrpcatflname,paidurl,fcpurl,city,state,countryname,countryiso,tscode,address,state,zipcode,phone,mobile,contactperson,pns,dupimg,smalldesc,etoofrqty,lastactiondatet,mcatid,isadult,pnsdisabled,membersince,locpref,categoryinfo,distance:geodist($lat,$lon,latlon),iildisplayflag,dispflagval,biztype,datarefid,parentglusrid,itemcode,itemprice,itemcurrency,largedesc,ecom_url,ecom_source_id,moq,moq_type* *0* *20* *true* *true* *15* *true* ** * mcatnametext^0.2* * titlews^0.5* * smalldesc^0.01* * title_text^1.5* * usrpcatname^0.1* * customspell^0.1* ** *true* ** * mcatnametext^0.5* * titlews* * title_text^3* * usrpcatname^0.1* * smalldesc^0.01* * customspell^0.1* ** *true* *1* *10* *xml* *true* *0* *parentglusrid* *true* *true* *im.search* *2* *true* *ALL* *1* *0* ** * mcatid:(1223 6240 825 1936 31235) titlex:("imswjutebagimsw")* * attribs:(locprefglobal locprefnational locprefcity locprefunknown)* * displayid:4768979112* * +((+datatype:product +attribs:(aprstatus20 aprstatus40 aprstatus50) +aggregate:true -attribs:liststatusnfl +((+countryiso:IN +isfcp:true) (+CustTypeWt:[149 TO 1499]) CustTypeWt:1870)) (+datatype:company -attribs:liststatusnfl +((+countryiso:IN +isfcp:true) (+CustTypeWt:[149 TO 1499]) CustTypeWt:1870))) -attribs:liststatusdnf* ** *20* *jute bags* *true* *"jutebagimsw"* *"bagimsw"* *"1223"* ** * map(query({!dismax qf=mcatid v=$mc1 pf=""}),0,0,1,2.0)* * map(eff_views,1,2,1.15,1)* * map(query({!dismax qf=titlex v=$ql1 pf=""}),0,0,1,1.5)* * map(query({!dismax qf=titlex v=$ql2 pf=""}),0,0,1,1.5)* * map(query({!dismax qf=attribs v='poorDescription' pf=''},0),0,0,1,0.02)* * if(exists(itemprice2),map(query({!dismax qf=itemprice2 v='0'}),0,0,1.2,1),1)* * map(sdesclen,0,150,1,1.5)* * map(sdesclen,0,0,0.1,1)* * map(CustTypeWt,700,1869,1.1,1)* * map(CustTypeWt,699,699,1.2,1)* * map(CustTypeWt,199,199,1.3,1)* * map(CustTypeWt,0,179,1.35,1)* * map(CustTypeWt,3399,3999,0.07,1)* * 
map(query({!dismax qf=attribs v='hot'}),0,0,1,1.2)* * map(query({!dismax qf=isphoto v='true' pf=""}),0,0,0.05,1)* ** *xml* *0* *0.3* *synonym_edismax* *on* *true* * * ** With Regards Aman Tandon
How to reordering search result by some function query
Hi, I have a requirement to reorder the search results by multiplying the *text relevance score* of a product with the *product_guideline_score,* which will be stored in index and will have some floating point number. e.g. On searching the *jute* in title if we got some results ID1 & ID2 ID1 -> title = jute score = 8.0 * product_guideline_score = 2.0* ID2 -> title = jute bags score = 7.5 * product_guideline_score** = 2.2* So the new score should be like this ID1 -> title = jute score = *product_score * 8 = 16.0* * product_guideline_score** = 2.0* ID2 -> title = jute bags score = *product_score * 7.5 = 16.5* * product_guideline_score** = 2.2* *So new ordering should be* ID2 -> title = jute bags score* = 16.5* ID1 -> title = jute score =* 16.0* How can I do this in single query on runtime in solr. With Regards Aman Tandon
Re: Maximum Number of entires in External Field?
> > I can provide examples if needed. Yes that will be so much helpful. Thank you so much. Then I will try both methodology. And will report the results back here. With Regards Aman Tandon On Tue, Sep 8, 2015 at 2:11 PM, Upayavira <u...@odoko.co.uk> wrote: > If you have just 5-7 items, then an external file will work, as will the > join query. You'll need to handle the 'default' case with the join > query, that is, making sure you do OR so that > documents matching the join are boosted above those matching the main > query, rather than the join being a filter on the main query. > > I can provide examples if needed. > > Upayavira > > On Mon, Sep 7, 2015, at 07:21 PM, Aman Tandon wrote: > > I am currently doing boosting for 5-7 things. will it work great with > > this > > too? > > > > With Regards > > Aman Tandon > > > > On Mon, Sep 7, 2015 at 11:42 PM, Upayavira <u...@odoko.co.uk> wrote: > > > > > External file field would work, but requires a full import of the > > > external file field every time you change a single entry, which is > > > pretty extreme. > > > > > > I've tested out "score joins" which seemed to perform very well and > > > achieved the same effect, but using another core, rather than an > > > external file. > > > > > > Thus: > > > > > > {!join score=max fromIndex=prices from=id to=id}{!boost b=price}*:* > > > > > > seemed to do the job of using the price as a boost. Of course you could > > > extend this like so: > > > > > > q={!join score=max fromIndex=prices from=id to=id}{!boost b=$b}*:* > > > b=sqrt(price) > > > > > > or such things to make the price a more reasonable value. > > > > > > Upayavira > > > > > > On Mon, Sep 7, 2015, at 06:21 PM, Aman Tandon wrote: > > > > Any suggestions? > > > > > > > > With Regards > > > > Aman Tandon > > > > > > > > On Mon, Sep 7, 2015 at 1:07 PM, Aman Tandon <amantandon...@gmail.com > > > > > > wrote: > > > > > > > > > Hi Upayavira, > > > > > > > > > > Have you tried it? 
> > > > > > > > > > > > > > > No > > > > > > > > > > E.g. external file fields don't play nice with Solr Cloud > > > > > > > > > > > > > > > We are not using Solr Cloud. > > > > > > > > > > > > > > >> What are you using the external file for? > > > > > > > > > > > > > > > We are doing the boosting in the search result which are *having > price > > > by > > > > > 1.2* & *country is India by 1.1*. We are doing by using the > boosting > > > > > parameter in conjucation with query & map function e.g. > > > *=map(query({!dismax > > > > > qf=hasPrice v='yes' pf=''},0),1,1,1,1)* > > > > > > > > > > This is being done with 5/6 parameters. And I am hoping it will > > > increase > > > > > query time. So I am planning to make the single score and populate > it > > > in > > > > > external file field. And this might reduce some time. > > > > > > > > > > Just to mention we are doing incremental updates after every 10 > > > minutes. > > > > > > > > > > With Regards > > > > > Aman Tandon > > > > > > > > > > On Mon, Sep 7, 2015 at 12:53 PM, Upayavira <u...@odoko.co.uk> wrote: > > > > > > > > > >> Have you tried it? I suspect your issue will be with the process > of > > > > >> reloading the external file rather than consuming it once loaded. > > > > >> > > > > >> What are you using the external file for? There may be other ways > > > also. > > > > >> E.g. external file fields don't play nice with Solr Cloud. > > > > >> > > > > >> Upayavira > > > > >> > > > > >> On Mon, Sep 7, 2015, at 07:05 AM, Aman Tandon wrote: > > > > >> > Hi, > > > > >> > > > > > >> > How much ids information can I define in External File? > Currently I > > > am > > > > >> > having the 100 Million records in my index. > > > > >> > > > > > >> > With Regards > > > > >> > Aman Tandon > > > > >> > > > > > > > > > > > > > >
Re: Maximum Number of entires in External Field?
Hi Upayavira, Have you tried it? No E.g. external file fields don't play nice with Solr Cloud We are not using Solr Cloud. > What are you using the external file for? We are doing the boosting in the search result which are *having price by 1.2* & *country is India by 1.1*. We are doing by using the boosting parameter in conjucation with query & map function e.g. *=map(query({!dismax qf=hasPrice v='yes' pf=''},0),1,1,1,1)* This is being done with 5/6 parameters. And I am hoping it will increase query time. So I am planning to make the single score and populate it in external file field. And this might reduce some time. Just to mention we are doing incremental updates after every 10 minutes. With Regards Aman Tandon On Mon, Sep 7, 2015 at 12:53 PM, Upayavira <u...@odoko.co.uk> wrote: > Have you tried it? I suspect your issue will be with the process of > reloading the external file rather than consuming it once loaded. > > What are you using the external file for? There may be other ways also. > E.g. external file fields don't play nice with Solr Cloud. > > Upayavira > > On Mon, Sep 7, 2015, at 07:05 AM, Aman Tandon wrote: > > Hi, > > > > How much ids information can I define in External File? Currently I am > > having the 100 Million records in my index. > > > > With Regards > > Aman Tandon >
Maximum Number of entires in External Field?
Hi, How many id entries can I define in an external file? I currently have 100 million records in my index. With Regards Aman Tandon
Re: Maximum Number of entires in External Field?
Any suggestions? With Regards Aman Tandon On Mon, Sep 7, 2015 at 1:07 PM, Aman Tandon <amantandon...@gmail.com> wrote: > Hi Upayavira, > > Have you tried it? > > > No > > E.g. external file fields don't play nice with Solr Cloud > > > We are not using Solr Cloud. > > >> What are you using the external file for? > > > We are doing the boosting in the search result which are *having price by > 1.2* & *country is India by 1.1*. We are doing by using the boosting > parameter in conjucation with query & map function e.g. > *=map(query({!dismax > qf=hasPrice v='yes' pf=''},0),1,1,1,1)* > > This is being done with 5/6 parameters. And I am hoping it will increase > query time. So I am planning to make the single score and populate it in > external file field. And this might reduce some time. > > Just to mention we are doing incremental updates after every 10 minutes. > > With Regards > Aman Tandon > > On Mon, Sep 7, 2015 at 12:53 PM, Upayavira <u...@odoko.co.uk> wrote: > >> Have you tried it? I suspect your issue will be with the process of >> reloading the external file rather than consuming it once loaded. >> >> What are you using the external file for? There may be other ways also. >> E.g. external file fields don't play nice with Solr Cloud. >> >> Upayavira >> >> On Mon, Sep 7, 2015, at 07:05 AM, Aman Tandon wrote: >> > Hi, >> > >> > How much ids information can I define in External File? Currently I am >> > having the 100 Million records in my index. >> > >> > With Regards >> > Aman Tandon >> > >
Re: Maximum Number of entires in External Field?
I am currently doing boosting for 5-7 things. will it work great with this too? With Regards Aman Tandon On Mon, Sep 7, 2015 at 11:42 PM, Upayavira <u...@odoko.co.uk> wrote: > External file field would work, but requires a full import of the > external file field every time you change a single entry, which is > pretty extreme. > > I've tested out "score joins" which seemed to perform very well and > achieved the same effect, but using another core, rather than an > external file. > > Thus: > > {!join score=max fromIndex=prices from=id to=id}{!boost b=price}*:* > > seemed to do the job of using the price as a boost. Of course you could > extend this like so: > > q={!join score=max fromIndex=prices from=id to=id}{!boost b=$b}*:* > b=sqrt(price) > > or such things to make the price a more reasonable value. > > Upayavira > > On Mon, Sep 7, 2015, at 06:21 PM, Aman Tandon wrote: > > Any suggestions? > > > > With Regards > > Aman Tandon > > > > On Mon, Sep 7, 2015 at 1:07 PM, Aman Tandon <amantandon...@gmail.com> > > wrote: > > > > > Hi Upayavira, > > > > > > Have you tried it? > > > > > > > > > No > > > > > > E.g. external file fields don't play nice with Solr Cloud > > > > > > > > > We are not using Solr Cloud. > > > > > > > > >> What are you using the external file for? > > > > > > > > > We are doing the boosting in the search result which are *having price > by > > > 1.2* & *country is India by 1.1*. We are doing by using the boosting > > > parameter in conjucation with query & map function e.g. > *=map(query({!dismax > > > qf=hasPrice v='yes' pf=''},0),1,1,1,1)* > > > > > > This is being done with 5/6 parameters. And I am hoping it will > increase > > > query time. So I am planning to make the single score and populate it > in > > > external file field. And this might reduce some time. > > > > > > Just to mention we are doing incremental updates after every 10 > minutes. 
> > > > > > With Regards > > > Aman Tandon > > > > > > On Mon, Sep 7, 2015 at 12:53 PM, Upayavira <u...@odoko.co.uk> wrote: > > > > > >> Have you tried it? I suspect your issue will be with the process of > > >> reloading the external file rather than consuming it once loaded. > > >> > > >> What are you using the external file for? There may be other ways > also. > > >> E.g. external file fields don't play nice with Solr Cloud. > > >> > > >> Upayavira > > >> > > >> On Mon, Sep 7, 2015, at 07:05 AM, Aman Tandon wrote: > > >> > Hi, > > >> > > > >> > How much ids information can I define in External File? Currently I > am > > >> > having the 100 Million records in my index. > > >> > > > >> > With Regards > > >> > Aman Tandon > > >> > > > > > > >
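For completeness, a sketch of the external file field setup being considered, with hypothetical names; the data file lives in the core's `data/` directory and is reloaded via an event listener:

```xml
<!-- schema.xml: keyField ties each line of the file to a document id -->
<fieldType name="extScore" class="solr.ExternalFileField"
           keyField="id" defVal="1.0"/>
<field name="combined_boost" type="extScore"/>

<!-- solrconfig.xml: reload the file whenever a new searcher opens -->
<listener event="newSearcher"
          class="org.apache.solr.schema.ExternalFileFieldReloader"/>

<!-- data/external_combined_boost.txt, one id=value line per document:
       doc123=1.25
       doc456=0.90
     then reference it in a function, e.g. boost=field(combined_boost) -->
```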
Re: How to configure solr to not bind at 8983
Hi Samy, Any particular reason not to use the -p parameter to start it on another port? ./solr start -p 9983 With Regards Aman Tandon On Thu, Aug 20, 2015 at 2:02 PM, Modassar Ather modather1...@gmail.com wrote: I think you need to add the port number in solr.xml too, under the hostPort attribute. STOP.PORT is SOLR.PORT-1000 and is set in the SOLR_HOME/bin/solr file. As far as I understand this cannot be changed, but I am not sure. Regards, Modassar On Thu, Aug 20, 2015 at 11:39 AM, Samy Ateia samyat...@hotmail.de wrote: I changed the Solr listen port in the solr.in.sh file in my Solr home directory by setting the variable SOLR_PORT=. But Solr is still trying to also listen on 8983, because it gets started with the -DSTOP.PORT=8983 variable. What is this -DSTOP.PORT variable for, and where should I configure it? I ran the install_solr_service.sh script to set up Solr and changed the SOLR_PORT afterwards. Best regards, Samy
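As Modassar notes, the stop port is derived from the listen port inside `bin/solr`; a sketch of the relationship (so changing SOLR_PORT in `solr.in.sh` moves both, provided both ports are free):

```shell
# solr.in.sh equivalent: choose a custom listen port
SOLR_PORT=9983

# bin/solr derives jetty's STOP.PORT from it as SOLR_PORT - 1000
STOP_PORT=$((SOLR_PORT - 1000))
echo "STOP.PORT=$STOP_PORT"   # prints STOP.PORT=8983
```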
Re: docValues
Hi, I am seeing a significant difference in the query time after using docValue what kind of difference, is it good or bad? With Regards Aman Tandon On Sat, Aug 8, 2015 at 11:38 PM, Nagasharath sharathrayap...@gmail.com wrote: I am seeing a significant difference in the query time after using docValue. I am curious to know what's happening with 'docValue' included in the schema On 07-Aug-2015, at 4:31 pm, Shawn Heisey apa...@elyograg.org wrote: On 8/7/2015 11:47 AM, naga sharathrayapati wrote: JVM-Memory has gone up from 3% to 17.1% In my experience, a healthy Java application (after the heap size has stabilized) will have a heap utilization graph where the low points are between 50 and 75 percent. If the low points in heap utilization are consistently below 25 percent, you would be better off reducing the heap size and allowing the OS to use that memory instead. If you want to track heap utilization, JVM-Memory in the Solr dashboard is a very poor tool. Use tools like visualvm or jconsole. https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap I need to add what I said about very low heap utilization to that wiki page. Thanks, Shawn
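For context, docValues are enabled per field in the schema and move sorting/faceting/function data into an on-disk column store read through the OS page cache rather than built up on the Java heap, which is why heap behaviour changes after enabling them. A sketch (field name hypothetical; the collection must be reindexed after the change):

```xml
<field name="price" type="tfloat" indexed="true" stored="true"
       docValues="true"/>
```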
Re: Query ReRanking question
Hi, Very nice mail thread. I think many people face the problem of maintaining both relevance and recency at the same time. boost=max(recip(ms(NOW/HOUR,publish_date),7.889e-10,1,1),scale(query($q),0,1)) Currently our search uses recency without any condition, and this approach should work well for our case too. Thanks Joel, Erick and Ravi. But Joel, I think it is a good idea to apply the sorting or any function last, or only in the re-ranking; otherwise it will affect search relevance. If many users like this idea, then we should work on this feature. With Regards Aman Tandon On Fri, Jan 16, 2015 at 11:23 PM, Erick Erickson erickerick...@gmail.com wrote: Ravi: Yep, this is the standard way to have recency influence the rank rather than take over absolute ordering via a sort=date_time or similar. Of course, how strongly the rank is influenced is more an art than a science as far as figuring out what actual constants to put in. Best, Erick On Fri, Jan 16, 2015 at 8:03 AM, Ravi Solr ravis...@gmail.com wrote: As per Erick's suggestion, reposting my response to the group. Joel and Erick, Thank you very much for helping me out with the ReRanking question a while ago. I have an alternative which seems to be working better for me than ReRanking; can you kindly let me know of any pitfalls you can think of with this approach? Since we value relevancy and recency at the same time, even though both are mutually exclusive, I thought maybe I could use function queries to adjust the boost as follows: boost=max(recip(ms(NOW/HOUR,publish_date),7.889e-10,1,1),scale(query($q),0,1)) What I intended to do here is: if it matched a more recent doc it will take recency into consideration, however if the relevancy is better than the date boost we keep relevancy. What do you guys think?
Thanks, Ravi Kiran Bhaskar On Mon, Sep 8, 2014 at 12:35 PM, Ravi Solr ravis...@gmail.com wrote: Joel and Erick, Thank you very much for explaining how the ReRanking works. Now it's a bit more clear. Thanks, Ravi Kiran Bhaskar On Sun, Sep 7, 2014 at 4:45 PM, Joel Bernstein joels...@gmail.com wrote: Oops, wrong usage pattern. It should be: 1) Main query is sorted by a field (scores tracked silently in the background). 2) Reranker is re-ranking docs based on the score from the main query. Joel Bernstein Search Engineer at Heliosearch On Sun, Sep 7, 2014 at 4:43 PM, Joel Bernstein joels...@gmail.com wrote: Ok, just reviewed the code. The ReRankingQParserPlugin always tracks the scores from the main query. So this explains things. Speaking of explaining things, the ReRankingParserPlugin also works with Lucene's explain. So if you use debugQuery=true we should see that the score from the initial query was combined with the score from the reRankQuery, which should be 1. You have stumbled on an interesting usage pattern which I never considered. But basically what's happening is: 1) Main query is sorted by score. 2) Reranker is re-ranking docs based on the score from the main query. No worries, Erick, you've taught me a lot over the past couple of years! Joel Bernstein Search Engineer at Heliosearch On Sun, Sep 7, 2014 at 11:37 AM, Erick Erickson erickerick...@gmail.com wrote: Joel: I find that whenever I say something totally wrong publicly, I remember the correction really really well... Thanks for straightening that out! Erick On Sat, Sep 6, 2014 at 12:58 PM, Joel Bernstein joels...@gmail.com wrote: The following query: http://localhost:8080/solr/select?q=malaysian airline crash&rq={!rerank reRankQuery=$rqq reRankDocs=1000}&rqq=*:*&sort=publish_date desc&fl=headline,publish_date,score is doing the following: The main query is sorted by publish_date. Then the results are reranked by *:*, which in theory would have no effect at all.
The reranker only uses the reRankQuery to re-rank the results. The sort param will always apply to the main query. Joel Bernstein Search Engineer at Heliosearch On Sat, Sep 6, 2014 at 2:33 PM, Ravi Solr ravis...@gmail.com wrote: Erick, Your idea about reversing Joel's suggestion seems to give the best results of all the options I tried... but I can't seem to understand why. I thought the query shown below should give irrelevant results, as sorting by date would throw relevancy off... but somehow it's getting relevant results with fair enough reverse chronology. It is as if the sort is applied after the docs are collected and reranked (which is what I wanted). One more thing that baffled me was, if I change reRankDocs from 1000 to 100
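Putting the two halves of this thread together: Joel's rerank request parameters, and the recip() date boost from Ravi's follow-up. The sketch below uses Python purely as a scratchpad; the host, port, core, and field names come from the thread's examples and may differ in your setup, and recip(x,m,a,b) reimplements Solr's function-query definition a/(m*x+b) as plain arithmetic to show how the 7.889e-10 constant decays the boost:

```python
from urllib.parse import urlencode

# Joel's rerank example: main query sorted by date, top 1000 docs
# re-ranked by the query in $rqq ('&' separators restored).
params = {
    "q": "malaysian airline crash",
    "rq": "{!rerank reRankQuery=$rqq reRankDocs=1000}",
    "rqq": "*:*",
    "sort": "publish_date desc",
    "fl": "headline,publish_date,score",
}
url = "http://localhost:8080/solr/select?" + urlencode(params)
print(url)

def recip(x, m, a, b):
    # Solr function query: recip(x, m, a, b) = a / (m * x + b)
    return a / (m * x + b)

MS_PER_DAY = 24 * 60 * 60 * 1000

# Decay of the date-boost term from boost=max(recip(...), scale(query($q),0,1))
# for documents of increasing age (in days):
for days in (0, 7, 30, 365):
    print(days, round(recip(days * MS_PER_DAY, 7.889e-10, 1, 1), 3))
```

With this constant the date boost falls to half after roughly two weeks, which is why taking max() with the scaled relevancy score lets an older but highly relevant document still win.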
Re: DocValues: Which format is better Default or Memory?
Hi, I tried to query with and without docValues; the query with docValues was taking more time. Could it be because IO gets involved, since some of the data will be in files on disk? "Are you sure anything else could affect your times?" Yes, I am sure. We re-indexed the whole index of 40 million records to implement docValues to improve the speed. And I somehow managed to run simultaneous queries with/without docValues, and I am getting a higher time with docValues by approx 200ms. As far as I could see, it increases as the number of hits increases. *My configuration for docValues is:* <field name="citydv" type="string" docValues="true" stored="true" required="false" omitNorms="true" multiValued="false" /> With Regards Aman Tandon On Thu, Jul 2, 2015 at 3:15 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: So first of all, DocValues is a strategy to store on disk (or in memory) the un-inverted index for the fields of interest. This has been done to SPEED UP the faceting calculation using the fc algorithm, and improve the memory usage. It is really weird that this is the cause of degraded performance. Building the DocValues should improve the query time to build facets, while increasing the indexing time. Are you sure anything else could affect your times? Let's try to help you out! 2015-07-02 4:19 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I tried to use docValues to reduce the search time, but when I am using the default format for docValues it is taking more time as compared to the normal faceting technique (without docValues). Should I go for the Memory format, or is there something missing? *Note:-* I am doing the indexing every 10 minutes and I am using Solr 4.8.1 With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
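When chasing a regression like this it helps to time the exact same facet request against both field variants so that only the faceting path differs. A minimal sketch of building the two requests — the host, core name, the "city"/"citydv" field pair, and the idea of keeping a non-docValues copy of the field around for comparison are assumptions for illustration, not details from the thread:

```python
from urllib.parse import urlencode

def facet_url(field):
    # Identical query each time; only the facet field (docValues vs.
    # plain indexed twin) changes between runs.
    params = {
        "q": "*:*",
        "rows": "0",
        "facet": "true",
        "facet.field": field,
        "facet.method": "fc",
    }
    return "http://localhost:8983/solr/collection1/select?" + urlencode(params)

for f in ("citydv", "city"):  # hypothetical docValues / non-docValues pair
    print(facet_url(f))
```

Firing each URL repeatedly (after warm-up) and comparing QTime in the responses isolates the faceting cost from caching effects.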
Re: DocValues: Which format is better Default or Memory?
Anything wrong? With Regards Aman Tandon On Thu, Jul 2, 2015 at 4:19 PM, Aman Tandon amantandon...@gmail.com wrote: > [previous messages in this thread quoted in full; snipped]
Re: DocValues: Which format is better Default or Memory?
So should I use the Memory format? With Regards Aman Tandon On Thu, Jul 2, 2015 at 9:20 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote: Alessandro Benedetti benedetti.ale...@gmail.com wrote: DocValues is a strategy to store on the disk (or in memory) the un-inverted index for the fields of interest. True. This has been done to SPEED UP the faceting calculation using the fc algorithm, and improve the memory usage. Part of the reason was to speed up the _startup_ time for faceting. This is not the first time I have read about people getting poorer query performance with DocValues. It does make sense: DocValues in the index means that they compete with other files for disk caching, and even when they are fully cached, the un-inverted structure has a speed edge due to being directly accessible as standard on-heap memory structures. The difference is likely to vary a great deal depending on the concrete corpus & hardware. - Toke Eskildsen
DocValues: Which format is better Default or Memory?
Hi, I tried to use docValues to reduce the search time, but when I am using the default format for docValues it is taking more time as compared to the normal faceting technique (without docValues). Should I go for the Memory format, or is there something missing? *Note:-* I am doing the indexing every 10 minutes and I am using Solr 4.8.1 With Regards Aman Tandon
Re: Help: Problem in customized token filter
Steve, Thank you, thank you so much. You guys are awesome. Steve, how can I learn more about the Lucene indexing process in more detail, e.g. after we send documents for indexing, which functions are called until the doc is actually stored in the index files? I will be thankful if you guide me here. With Regards Aman Tandon On Fri, Jun 19, 2015 at 10:48 AM, Steve Rowe sar...@gmail.com wrote: Aman, Solr uses the same token filter instances over and over, calling reset() before sending each document through. Your code sets "exhausted" to true and then never sets it back to false, so the next time the token filter instance is used, its "exhausted" value is still true, so no input stream tokens are concatenated ever again. Does that make sense? Steve www.lucidworks.com On Jun 19, 2015, at 1:10 AM, Aman Tandon amantandon...@gmail.com wrote: Hi Steve, "you never set exhausted to false, and when the filter got reused, it incorrectly carried state from the previous document." Thanks for replying, but I am not able to understand this. With Regards Aman Tandon On Fri, Jun 19, 2015 at 10:25 AM, Steve Rowe sar...@gmail.com wrote: Hi Aman, The admin UI screenshot you linked to is from an older version of Solr - what version are you using? Lots of extraneous angle brackets and asterisks got into your email and made for a bunch of cleanup work before I could read or edit it. In the future, please put your code somewhere people can easily read it and copy/paste it into an editor: in a github gist or on a paste service, etc. Looks to me like your use of "exhausted" is unnecessary, and is likely the cause of the problem you saw (only one document getting processed): you never set exhausted to false, and when the filter got reused, it incorrectly carried state from the previous document. Here's a simpler version that's hopefully more correct and more efficient (2 fewer copies from the StringBuilder to the final token).
Note: I didn't test it: https://gist.github.com/sarowe/9b9a52b683869ced3a17 Steve www.lucidworks.com On Jun 18, 2015, at 11:33 AM, Aman Tandon amantandon...@gmail.com wrote: Please help, what wrong I am doing here. please guide me. With Regards Aman Tandon On Thu, Jun 18, 2015 at 4:51 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, I created a token concat filter to concat all the tokens from the token stream. It creates the concatenated token as expected. But when I am posting the xml containing more than 30,000 documents, then only the first document is having the data of that field. > [schema and filter code quoted in full; snipped]
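Steve's point about reuse is the key Lucene contract in this thread: the same TokenFilter instance is recycled for every document, and any per-document state must be cleared in reset(). Here is a toy model of the bug in Python (not Lucene code; the "exhausted" flag mirrors the one in Aman's filter, and the class name is made up for illustration):

```python
class ConcatFilter:
    """Toy model of a reused token filter: emits each token plus one
    concatenated token, then nothing until reset() is called."""

    def __init__(self):
        self.exhausted = False

    def reset(self):
        # The fix: clear per-document state when the instance is reused.
        self.exhausted = False

    def run(self, tokens):
        if self.exhausted:        # the bug: stays True for doc 2, 3, ...
            return []
        self.exhausted = True
        return tokens + ["".join(tokens)]

f = ConcatFilter()
print(f.run(["solr", "train"]))    # doc 1: tokens plus concatenation
print(f.run(["lucene", "index"]))  # doc 2 without reset(): nothing!
f.reset()
print(f.run(["lucene", "index"]))  # doc 2 after reset(): works again
```

In the real filter the same effect would come from overriding reset() to call super.reset(), set exhausted back to false, and clear the StringBuilder — which is essentially what Steve's gist does.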
Help: Problem in customized token filter
Hi, I created a *token concat filter* to concat all the tokens from the token stream. It creates the concatenated token as expected. But when I am posting the xml containing more than 30,000 documents, then only the first document is having the data of that field.

*Schema:*

  <field name="titlex" type="text" indexed="true" stored="false" required="false" omitNorms="false" multiValued="false" />

  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true" tokenSeparator=""/>
      <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      <filter class="com.xyz.analysis.concat.ConcatenateWordsFilterFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="stemmed_synonyms_text_prime_ex_index.txt" ignoreCase="true" expand="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_text_prime_search.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
      <filter class="com.xyz.analysis.concat.ConcatenateWordsFilterFactory"/>
    </analyzer>
  </fieldType>

Please help me. The code for the filter is as follows, please take a look. Here is a picture of what the filter is doing: http://i.imgur.com/THCsYtG.png?1 The code of the concat filter is:

  package com.xyz.analysis.concat;

  import java.io.IOException;

  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
  import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
  import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
  import org.apache.lucene.analysis.tokenattributes.TypeAttribute;

  public class ConcatenateWordsFilter extends TokenFilter {

    private CharTermAttribute charTermAttribute = addAttribute(CharTermAttribute.class);
    private OffsetAttribute offsetAttribute = addAttribute(OffsetAttribute.class);
    PositionIncrementAttribute posIncr = addAttribute(PositionIncrementAttribute.class);
    TypeAttribute typeAtrr = addAttribute(TypeAttribute.class);

    private StringBuilder stringBuilder = new StringBuilder();
    private boolean exhausted = false;

    /**
     * Creates a new ConcatenateWordsFilter
     * @param input TokenStream that will be filtered
     */
    public ConcatenateWordsFilter(TokenStream input) {
      super(input);
    }

    /**
     * {@inheritDoc}
     */
    @Override
    public final boolean incrementToken() throws IOException {
      while (!exhausted && input.incrementToken()) {
        char terms[] = charTermAttribute.buffer();
        int termLength = charTermAttribute.length();
        if (typeAtrr.type().equals("<ALPHANUM>")) {
          stringBuilder.append(terms, 0, termLength);
        }
        charTermAttribute.copyBuffer(terms, 0, termLength);
        return true;
      }
      if (!exhausted) {
        exhausted = true;
        String sb = stringBuilder.toString();
        System.err.println("The Data got is " + sb);
        int sbLength = sb.length();
        // posIncr.setPositionIncrement(0);
        charTermAttribute.copyBuffer(sb.toCharArray(), 0, sbLength);
        offsetAttribute.setOffset(offsetAttribute.startOffset(), offsetAttribute.startOffset() + sbLength);
        stringBuilder.setLength(0);
        // typeAtrr.setType("CONCATENATED");
        return true;
      }
      return false;
    }
  }

With Regards Aman Tandon
Re: Help: Problem in customized token filter
Please help, what wrong I am doing here. please guide me. With Regards Aman Tandon On Thu, Jun 18, 2015 at 4:51 PM, Aman Tandon amantandon...@gmail.com wrote: > [original message with schema and filter code quoted in full; snipped]
Re: Help: Problem in customized token filter
Hi Steve, you never set exhausted to false, and when the filter got reused, *it incorrectly carried state from the previous document.* Thanks for replying, but I am not able to understand this. With Regards Aman Tandon On Fri, Jun 19, 2015 at 10:25 AM, Steve Rowe sar...@gmail.com wrote: Hi Aman, The admin UI screenshot you linked to is from an older version of Solr - what version are you using? Lots of extraneous angle brackets and asterisks got into your email and made for a bunch of cleanup work before I could read or edit it. In the future, please put your code somewhere people can easily read it and copy/paste it into an editor: into a github gist or on a paste service, etc. Looks to me like your use of “exhausted” is unnecessary, and is likely the cause of the problem you saw (only one document getting processed): you never set exhausted to false, and when the filter got reused, it incorrectly carried state from the previous document. Here’s a simpler version that’s hopefully more correct and more efficient (2 fewer copies from the StringBuilder to the final token). Note: I didn’t test it: https://gist.github.com/sarowe/9b9a52b683869ced3a17 Steve www.lucidworks.com On Jun 18, 2015, at 11:33 AM, Aman Tandon amantandon...@gmail.com wrote: Please help, what wrong I am doing here. please guide me. With Regards Aman Tandon On Thu, Jun 18, 2015 at 4:51 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, I created a *token concat filter* to concat all the tokens from token stream. It creates the concatenated token as expected. But when I am posting the xml containing more than 30,000 documents, then only first document is having the data of that field. 
> [schema and filter code quoted in full; snipped]
Re: Help: Problem in customized token filter
Yes I just saw. With Regards Aman Tandon On Fri, Jun 19, 2015 at 10:39 AM, Steve Rowe sar...@gmail.com wrote: Aman, My version won’t produce anything at all, since incrementToken() always returns false… I updated the gist (at the same URL) to fix the problem by returning true from incrementToken() once and then false until reset() is called. It also handles the case when the concatenated token is zero length by not emitting a token. Steve www.lucidworks.com On Jun 19, 2015, at 12:55 AM, Steve Rowe sar...@gmail.com wrote: Hi Aman, The admin UI screenshot you linked to is from an older version of Solr - what version are you using? Lots of extraneous angle brackets and asterisks got into your email and made for a bunch of cleanup work before I could read or edit it. In the future, please put your code somewhere people can easily read it and copy/paste it into an editor: into a github gist or on a paste service, etc. Looks to me like your use of “exhausted” is unnecessary, and is likely the cause of the problem you saw (only one document getting processed): you never set exhausted to false, and when the filter got reused, it incorrectly carried state from the previous document. Here’s a simpler version that’s hopefully more correct and more efficient (2 fewer copies from the StringBuilder to the final token). Note: I didn’t test it: https://gist.github.com/sarowe/9b9a52b683869ced3a17 Steve www.lucidworks.com On Jun 18, 2015, at 11:33 AM, Aman Tandon amantandon...@gmail.com wrote: Please help, what wrong I am doing here. please guide me. With Regards Aman Tandon On Thu, Jun 18, 2015 at 4:51 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, I created a *token concat filter* to concat all the tokens from token stream. It creates the concatenated token as expected. But when I am posting the xml containing more than 30,000 documents, then only first document is having the data of that field. 
> [schema and filter code quoted in full; snipped]
Re: How to create concatenated token
Hi Erick, In that issue you forwarded to me, they want to make one token from all the tokens received from the token stream, but in my case I want to keep the tokens the same and create an extra new token which is the concatenation of all the tokens. "which, I'd guess, is the case here. I mean do you really want to concatenate 50 tokens?" — We are applying it on the *title field* of products, so the max length can be 10 I guess, and even that will be a rare case. With Regards Aman Tandon On Wed, Jun 17, 2015 at 7:16 PM, Erick Erickson erickerick...@gmail.com wrote: If you used the JIRA I linked, vote for it, add any improvements etc. Anyone can attach a patch to a JIRA, you just have to create a login. That said, this may be too rare a use-case to deal with. I just thought of shingling, which I should have suggested before, that will work for concatenating small numbers of tokens which, I'd guess, is the case here. I mean do you really want to concatenate 50 tokens? Best, Erick On Wed, Jun 17, 2015 at 12:07 AM, Aman Tandon amantandon...@gmail.com wrote: Dear Erick, e.g. "Solr training" — Porter: solr train (positions 1 2); Concatenated: solr train solrtrain (positions 1 2). I did implement the filter as per my requirement. Thank you so much for your help and guidance. So how could I contribute it to Solr? With Regards Aman Tandon On Wed, Jun 17, 2015 at 10:14 AM, Aman Tandon amantandon...@gmail.com wrote: Hi Erick, Thank you so much, it will be helpful for me to learn how to save the state of a token. I had no idea how to save the state of previous tokens, so it was difficult to generate a concatenated token at the end. Is there anything I should read to learn more about it? With Regards Aman Tandon On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson erickerick...@gmail.com wrote: I really question the premise, but have a look at: https://issues.apache.org/jira/browse/SOLR-7193 Note that this is not committed and I haven't reviewed it, so I don't have anything to say about that.
And you'd have to implement it as a custom Filter. Best, Erick On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, Any guesses, how could I achieve this behaviour. With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon amantandon...@gmail.com wrote: e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) typo error e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training) With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote: We has some business logic to search the user query in user intent or finding the exact matching products. e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) As we can see it is phrase query so it will took more time than the single stemmed token query. There are also 5-7 words phrase query. So we want to reduce the search time by implementing this feature. With Regards Aman Tandon On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens ? Maybe we can find a better solution to concat all the tokens in one single big token . I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you explain a little bit better what you would like to do ! Cheers 2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create the concatenated token of all the tokens created from the last item of my analyzer chain. *Suppose my analyzer chain is :* * tokenizer class=solr.WhitespaceTokenizerFactory / filter class=solr.WordDelimiterFilterFactory catenateAll=1 splitOnNumerics=1 preserveOriginal=1/filter class=solr.EdgeNGramFilterFactory minGramSize=2 maxGramSize=15 side=front /filter class=solr.PorterStemmerFilterFactory/* I want to create a concatenated token plugin to add at concatenated token along with the last token. e.g. 
Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 Please help me out. How to create custom filter for this requirement. With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti
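[Editor's note] The buffer-then-emit logic discussed in this thread can be sketched outside Lucene. The function below is only a toy model in Python: a real Solr filter is a Java TokenFilter subclass that buffers terms (e.g. via captureState()) and emits the concatenated term from incrementToken(). The function name and the (term, position_increment) tuple representation are invented for illustration.

```python
def concat_filter(tokens):
    """Pass tokens through unchanged, then emit one extra token that is the
    concatenation of all terms. The extra token gets position increment 0,
    i.e. it is stacked on the position of the last real token."""
    out = list(tokens)  # each token is a (term, position_increment) pair
    if out:
        concatenated = "".join(term for term, _ in out)
        out.append((concatenated, 0))
    return out
```

For the thread's example, concat_filter([("solr", 1), ("train", 1)]) yields solr and train at positions 1 and 2, plus solrtrain stacked on position 2, which matches the desired "solr train solrtrain" output.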
Contribute the Customized Phonetic Filter to Apache Solr
Hi, We created a new phonetic filter. It is working great on our products; most of our suppliers are Indian, and it is quite helpful for providing exact results, e.g. 1) rikshaw still finds the suppliers of rickshaw 2) telefone still finds the suppliers of telephone. We also analyzed our search satisfaction feedback, and it improved by 13 points (54% to 67%) just after implementing it. We want to contribute it to Solr, so how can I do that? With Regards Aman Tandon
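[Editor's note] The usual contribution route at the time of this thread was to open a JIRA issue and attach a patch. A hedged sketch of the workflow follows; SOLR-XXXX is a placeholder for the issue number you get when you open the ticket, and the exact build targets may differ by branch.

```shell
# Check out the source, apply your filter changes under solr/, and run the tests
# (the Lucene/Solr build used Ant in this era):
git clone https://github.com/apache/lucene-solr.git
cd lucene-solr
ant test

# Produce a patch named after the JIRA issue you opened at
# https://issues.apache.org/jira/browse/SOLR, then attach it to that issue:
git diff > SOLR-XXXX.patch
```

Anyone can attach a patch to a JIRA issue after creating a login, as Erick notes elsewhere in this archive.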
Re: How to create concatenated token
Dear Erick, e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 I did implemented the filter as per my requirement. Thank you so much for your help and guidance. So how could I contribute it to the solr. With Regards Aman Tandon On Wed, Jun 17, 2015 at 10:14 AM, Aman Tandon amantandon...@gmail.com wrote: Hi Erick, Thank you so much, it will be helpful for me to learn how to save the state of token. I has no idea of how to save state of previous tokens due to this it was difficult to generate a concatenated token in the last. So is there anything should I read to learn more about it. With Regards Aman Tandon On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson erickerick...@gmail.com wrote: I really question the premise, but have a look at: https://issues.apache.org/jira/browse/SOLR-7193 Note that this is not committed and I haven't reviewed it so I don't have anything to say about that. And you'd have to implement it as a custom Filter. Best, Erick On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, Any guesses, how could I achieve this behaviour. With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon amantandon...@gmail.com wrote: e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) typo error e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training) With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote: We has some business logic to search the user query in user intent or finding the exact matching products. e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) As we can see it is phrase query so it will took more time than the single stemmed token query. There are also 5-7 words phrase query. So we want to reduce the search time by implementing this feature. 
With Regards Aman Tandon On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens ? Maybe we can find a better solution to concat all the tokens in one single big token . I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you explain a little bit better what you would like to do ! Cheers 2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create the concatenated token of all the tokens created from the last item of my analyzer chain. *Suppose my analyzer chain is :* * tokenizer class=solr.WhitespaceTokenizerFactory / filter class=solr.WordDelimiterFilterFactory catenateAll=1 splitOnNumerics=1 preserveOriginal=1/filter class=solr.EdgeNGramFilterFactory minGramSize=2 maxGramSize=15 side=front /filter class=solr.PorterStemmerFilterFactory/* I want to create a concatenated token plugin to add at concatenated token along with the last token. e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 Please help me out. How to create custom filter for this requirement. With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
How to create concatenated token
Hi, I have a requirement to create a concatenated token of all the tokens produced by the last stage of my analyzer chain. *Suppose my analyzer chain is:*

<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" catenateAll="1" splitOnNumerics="1" preserveOriginal="1"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
<filter class="solr.PorterStemFilterFactory"/>

I want to create a concatenated-token plugin that adds a concatenated token along with the last token. e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated:-* solr train solrtrain Position 1 2 Please help me out: how do I create a custom filter for this requirement? With Regards Aman Tandon
Re: How to create concatenated token
e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) typo error e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training) With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote: We has some business logic to search the user query in user intent or finding the exact matching products. e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) As we can see it is phrase query so it will took more time than the single stemmed token query. There are also 5-7 words phrase query. So we want to reduce the search time by implementing this feature. With Regards Aman Tandon On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens ? Maybe we can find a better solution to concat all the tokens in one single big token . I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you explain a little bit better what you would like to do ! Cheers 2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create the concatenated token of all the tokens created from the last item of my analyzer chain. *Suppose my analyzer chain is :* * tokenizer class=solr.WhitespaceTokenizerFactory / filter class=solr.WordDelimiterFilterFactory catenateAll=1 splitOnNumerics=1 preserveOriginal=1/filter class=solr.EdgeNGramFilterFactory minGramSize=2 maxGramSize=15 side=front /filter class=solr.PorterStemmerFilterFactory/* I want to create a concatenated token plugin to add at concatenated token along with the last token. e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 Please help me out. How to create custom filter for this requirement. 
With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
Re: How to create concatenated token
We have some business logic to search the user query within the user intent, or to find the exact matching products, e.g. intent for solr training: fq=id:(234 456 545) title:(solr training). As you can see it is a phrase query, so it takes more time than a single stemmed-token query. There are also 5-7 word phrase queries. So we want to reduce the search time by implementing this feature. With Regards Aman Tandon On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens? Maybe we can find a better solution than concatenating all the tokens into one single big token. I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you could explain a little better what you would like to do! Cheers 2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create a concatenated token of all the tokens produced by the last stage of my analyzer chain. *Suppose my analyzer chain is:*

<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterFilterFactory" catenateAll="1" splitOnNumerics="1" preserveOriginal="1"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15" side="front"/>
<filter class="solr.PorterStemFilterFactory"/>

I want to create a concatenated-token plugin that adds a concatenated token along with the last token. e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated:-* solr train solrtrain Position 1 2 Please help me out: how do I create a custom filter for this requirement? With Regards Aman Tandon -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience - 1794 England
Re: How to create concatenated token
Hi, Any guesses, how could I achieve this behaviour. With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon amantandon...@gmail.com wrote: e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) typo error e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training) With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote: We has some business logic to search the user query in user intent or finding the exact matching products. e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) As we can see it is phrase query so it will took more time than the single stemmed token query. There are also 5-7 words phrase query. So we want to reduce the search time by implementing this feature. With Regards Aman Tandon On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens ? Maybe we can find a better solution to concat all the tokens in one single big token . I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you explain a little bit better what you would like to do ! Cheers 2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create the concatenated token of all the tokens created from the last item of my analyzer chain. *Suppose my analyzer chain is :* * tokenizer class=solr.WhitespaceTokenizerFactory / filter class=solr.WordDelimiterFilterFactory catenateAll=1 splitOnNumerics=1 preserveOriginal=1/filter class=solr.EdgeNGramFilterFactory minGramSize=2 maxGramSize=15 side=front /filter class=solr.PorterStemmerFilterFactory/* I want to create a concatenated token plugin to add at concatenated token along with the last token. e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 Please help me out. 
How to create custom filter for this requirement. With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
Re: How to create concatenated token
Hi Erick, Thank you so much, it will be helpful for me to learn how to save the state of token. I has no idea of how to save state of previous tokens due to this it was difficult to generate a concatenated token in the last. So is there anything should I read to learn more about it. With Regards Aman Tandon On Wed, Jun 17, 2015 at 9:20 AM, Erick Erickson erickerick...@gmail.com wrote: I really question the premise, but have a look at: https://issues.apache.org/jira/browse/SOLR-7193 Note that this is not committed and I haven't reviewed it so I don't have anything to say about that. And you'd have to implement it as a custom Filter. Best, Erick On Tue, Jun 16, 2015 at 5:55 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, Any guesses, how could I achieve this behaviour. With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:15 PM, Aman Tandon amantandon...@gmail.com wrote: e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) typo error e.g. Intent for solr training: fq=id:(234 456 545) title:(solr training) With Regards Aman Tandon On Tue, Jun 16, 2015 at 8:13 PM, Aman Tandon amantandon...@gmail.com wrote: We has some business logic to search the user query in user intent or finding the exact matching products. e.g. Intent for solr training: fq=id: 234, 456, 545 title(solr training) As we can see it is phrase query so it will took more time than the single stemmed token query. There are also 5-7 words phrase query. So we want to reduce the search time by implementing this feature. With Regards Aman Tandon On Tue, Jun 16, 2015 at 6:42 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Can I ask you why you need to concatenate the tokens ? Maybe we can find a better solution to concat all the tokens in one single big token . I find it difficult to understand the reasons behind tokenising, token filtering and then un-tokenizing again :) It would be great if you explain a little bit better what you would like to do ! 
Cheers 2015-06-16 13:26 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a requirement to create the concatenated token of all the tokens created from the last item of my analyzer chain. *Suppose my analyzer chain is :* * tokenizer class=solr.WhitespaceTokenizerFactory / filter class=solr.WordDelimiterFilterFactory catenateAll=1 splitOnNumerics=1 preserveOriginal=1/filter class=solr.EdgeNGramFilterFactory minGramSize=2 maxGramSize=15 side=front /filter class=solr.PorterStemmerFilterFactory/* I want to create a concatenated token plugin to add at concatenated token along with the last token. e.g. Solr training *Porter:-* solr train Position 1 2 *Concatenated :-* solr train solrtrain Position 1 2 Please help me out. How to create custom filter for this requirement. With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
Re: How To: Debugging the whole indexing process
Please help me here. With Regards Aman Tandon On Sat, May 30, 2015 at 12:43 AM, Aman Tandon amantandon...@gmail.com wrote: Thanks Alex, yes it is for my testing, to understand the code/process flow actually. Any other ideas? With Regards Aman Tandon On Fri, May 29, 2015 at 12:48 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: In production or in test? I assume in test. This level of detail usually implies some sort of Java debugger and Java instrumentation enabled. E.g. Chronon, which is commercial but can be tried as a plugin with the IntelliJ IDEA full-version trial. Regards, Alex On 29 May 2015 4:38 pm, Aman Tandon amantandon...@gmail.com wrote: Hi, I want to debug the whole indexing process, i.e. the life cycle of indexing (each and every function call, going function by function), from the posting of data.xml to the creation of the various index files (_fnm, _fdt, etc.). So what should I set up, and how do I start? Please help; I will be thankful to you.

<add>
  <doc>
    <field name="title"><![CDATA[Aman Tandon]]></field>
    <field name="job_role"><![CDATA[Search Engineer]]></field>
  </doc>
</add>

With Regards Aman Tandon
How To: Debugging the whole indexing process
Hi, I want to debug the whole indexing process, i.e. the life cycle of indexing (each and every function call, going function by function), from the posting of data.xml to the creation of the various index files (_fnm, _fdt, etc.). So what should I set up, and how do I start? Please help; I will be thankful to you.

<add>
  <doc>
    <field name="title"><![CDATA[Aman Tandon]]></field>
    <field name="job_role"><![CDATA[Search Engineer]]></field>
  </doc>
</add>

With Regards Aman Tandon
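[Editor's note] One common way to step through the indexing path function by function is to attach a remote Java debugger over JDWP and set breakpoints in the update/indexing classes. A minimal config sketch follows, assuming a standalone Solr started via bin/solr with its -a flag for extra JVM arguments; the core name "mycore" and the debug port 18983 are arbitrary examples.

```shell
# Start Solr with the JVM listening for a remote debugger.
# (Use suspend=y instead if you want to trace startup code too.)
bin/solr start -a "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=18983"

# Attach a remote-debug session (IntelliJ IDEA / Eclipse) to port 18983,
# set breakpoints in the indexing code, then post the document:
curl "http://localhost:8983/solr/mycore/update?commit=true" \
     -H "Content-Type: text/xml" --data-binary @data.xml
```

Stepping down from the update handler into Lucene's IndexWriter will show the call chain that ultimately writes the _fnm, _fdt, and other segment files mentioned above.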
Re: docValues: Can we apply synonym
Hi Upayavira, How the copyField will help in my scenario when I have to add the synonym in docValue enable field. With Regards Aman Tandon On Sat, May 30, 2015 at 1:18 AM, Upayavira u...@odoko.co.uk wrote: Use copyField to clone the field for faceting purposes. Upayavira On Fri, May 29, 2015, at 08:06 PM, Aman Tandon wrote: Hi Erick, Thanks for suggestion, We are this query parser plugin ( *SynonymExpandingExtendedDismaxQParserPlugin*) to manage multi-word synonym. So it does work slower than edismax that's why it is not in contrib right? (I am asking this question because we are using for all our searches to handle 10 multiword ice cube, icecube etc) *Moreover I thought a solution for this docValue problem* I need to make city field as *multivalued* and by this I mean i will add the synonym (*mumbai, bombay*) as an extra value to that field if present. Now searching operation will work fine as before. *field name=citymumbai/fieldfield name=citybombay/field* The only prob is if we have to remove the 'city alias/synonym facets' when we are providing results to the clients. *mumbai, 1000* With Regards Aman Tandon On Fri, May 29, 2015 at 7:26 PM, Erick Erickson erickerick...@gmail.com wrote: Do take time for performance testing with that parser. It can be slow depending on your data as I remember. That said it solves the problem it set out to solve so if it meets your SLAs, it can be a life-saver. Best, Erick On Fri, May 29, 2015 at 2:35 AM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Even if a little bit outdated, that query parser is really really cool to manage synonyms ! +1 ! 2015-05-29 1:01 GMT+01:00 Aman Tandon amantandon...@gmail.com: Thanks chris. Yes we are using it for handling multiword synonym problem. With Regards Aman Tandon On Fri, May 29, 2015 at 12:38 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Again, I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. 
http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:42 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Ok and what synonym processor you is talking about maybe it could help ? With Regards Aman Tandon On Thu, May 28, 2015 at 4:01 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Sorry, my bad. The synonym processor I mention works differently. It's an extension of the EDisMax query processor and doesn't require field level synonym configs. -Original Message- From: Reitzel, Charles [mailto:charles.reit...@tiaa-cref.org] Sent: Wednesday, May 27, 2015 6:12 PM To: solr-user@lucene.apache.org Subject: RE: docValues: Can we apply synonym But the query analysis isn't on a specific field, it is applied to the query string. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:08 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Hi Charles, The problem here is that the docValues works only with primitives data type only like String, int, etc So how could we apply synonym on primitive data type. With Regards Aman Tandon On Thu, May 28, 2015 at 3:19 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Is there any reason you cannot apply the synonyms at query time? Applying synonyms at indexing time has problems, e.g. polluting the term frequency for synonyms added, preventing distance queries, ... Since city names often have multiple terms, e.g. New York, Den Hague, etc., I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. Tastes great, less filling. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ We found this to fix synonyms like ny for New York and vice versa. Haven't tried it with docValues, tho. 
-Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Tuesday, May 26, 2015 11:15 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Yes it could be :) Anyway thanks for helping. With Regards Aman Tandon On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I should investigate
Re: docValues: Can we apply synonym
Hi Erick, Thanks for the suggestion. We are using this query parser plugin (*SynonymExpandingExtendedDismaxQParserPlugin*) to manage multi-word synonyms. So it does work slower than edismax; is that why it is not in contrib? (I am asking this question because we are using it for all our searches to handle 10 multiword synonyms, ice cube, icecube, etc.) *Moreover, I thought of a solution for this docValues problem.* I need to make the city field *multivalued*; by this I mean I will add the synonym (*mumbai, bombay*) as an extra value to that field if present. Then the searching operation will work fine as before: <field name="city">mumbai</field><field name="city">bombay</field>. The only problem is that we then have to remove the 'city alias/synonym' facets when we provide results to the clients. *mumbai, 1000* With Regards Aman Tandon On Fri, May 29, 2015 at 7:26 PM, Erick Erickson erickerick...@gmail.com wrote: Do take time for performance testing with that parser. It can be slow depending on your data as I remember. That said it solves the problem it set out to solve so if it meets your SLAs, it can be a life-saver. Best, Erick On Fri, May 29, 2015 at 2:35 AM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Even if a little bit outdated, that query parser is really really cool to manage synonyms ! +1 ! 2015-05-29 1:01 GMT+01:00 Aman Tandon amantandon...@gmail.com: Thanks Chris. Yes, we are using it for handling the multiword synonym problem. With Regards Aman Tandon On Fri, May 29, 2015 at 12:38 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Again, I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:42 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Ok, and what synonym processor are you talking about? Maybe it could help?
With Regards Aman Tandon On Thu, May 28, 2015 at 4:01 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Sorry, my bad. The synonym processor I mention works differently. It's an extension of the EDisMax query processor and doesn't require field level synonym configs. -Original Message- From: Reitzel, Charles [mailto:charles.reit...@tiaa-cref.org] Sent: Wednesday, May 27, 2015 6:12 PM To: solr-user@lucene.apache.org Subject: RE: docValues: Can we apply synonym But the query analysis isn't on a specific field, it is applied to the query string. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:08 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Hi Charles, The problem here is that the docValues works only with primitives data type only like String, int, etc So how could we apply synonym on primitive data type. With Regards Aman Tandon On Thu, May 28, 2015 at 3:19 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Is there any reason you cannot apply the synonyms at query time? Applying synonyms at indexing time has problems, e.g. polluting the term frequency for synonyms added, preventing distance queries, ... Since city names often have multiple terms, e.g. New York, Den Hague, etc., I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. Tastes great, less filling. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ We found this to fix synonyms like ny for New York and vice versa. Haven't tried it with docValues, tho. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Tuesday, May 26, 2015 11:15 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Yes it could be :) Anyway thanks for helping. With Regards Aman Tandon On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I should investigate that, as usually synonyms are analysis stage. 
A simple way is to replace the word with all its synonyms ( including original word), but simply using this kind of processor will change the token position and offsets, modifying the actual content of the document . I am from Bombay will become I am from Bombay Mumbai which can be annoying. So a clever approach must be investigated. 2015-05-26 17:36 GMT+01:00 Aman Tandon amantandon...@gmail.com : Okay So how could I do it with UpdateProcessors? With Regards Aman Tandon On Tue
Re: How To: Debugging the whole indexing process
Thanks Alex, yes it is for my testing, to understand the code/process flow actually. Any other ideas? With Regards Aman Tandon On Fri, May 29, 2015 at 12:48 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: In production or in test? I assume in test. This level of detail usually implies some sort of Java debugger and Java instrumentation enabled. E.g. Chronon, which is commercial but can be tried as a plugin with the IntelliJ IDEA full-version trial. Regards, Alex On 29 May 2015 4:38 pm, Aman Tandon amantandon...@gmail.com wrote: Hi, I want to debug the whole indexing process, i.e. the life cycle of indexing (each and every function call, going function by function), from the posting of data.xml to the creation of the various index files (_fnm, _fdt, etc.). So what should I set up, and how do I start? Please help; I will be thankful to you.

<add>
  <doc>
    <field name="title"><![CDATA[Aman Tandon]]></field>
    <field name="job_role"><![CDATA[Search Engineer]]></field>
  </doc>
</add>

With Regards Aman Tandon
Re: docValues: Can we apply synonym
Thanks chris. Yes we are using it for handling multiword synonym problem. With Regards Aman Tandon On Fri, May 29, 2015 at 12:38 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Again, I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:42 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Ok and what synonym processor you is talking about maybe it could help ? With Regards Aman Tandon On Thu, May 28, 2015 at 4:01 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Sorry, my bad. The synonym processor I mention works differently. It's an extension of the EDisMax query processor and doesn't require field level synonym configs. -Original Message- From: Reitzel, Charles [mailto:charles.reit...@tiaa-cref.org] Sent: Wednesday, May 27, 2015 6:12 PM To: solr-user@lucene.apache.org Subject: RE: docValues: Can we apply synonym But the query analysis isn't on a specific field, it is applied to the query string. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:08 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Hi Charles, The problem here is that the docValues works only with primitives data type only like String, int, etc So how could we apply synonym on primitive data type. With Regards Aman Tandon On Thu, May 28, 2015 at 3:19 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Is there any reason you cannot apply the synonyms at query time? Applying synonyms at indexing time has problems, e.g. polluting the term frequency for synonyms added, preventing distance queries, ... Since city names often have multiple terms, e.g. New York, Den Hague, etc., I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. 
Tastes great, less filling. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ We found this to fix synonyms like ny for New York and vice versa. Haven't tried it with docValues, tho. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Tuesday, May 26, 2015 11:15 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Yes it could be :) Anyway thanks for helping. With Regards Aman Tandon On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I should investigate that, as usually synonyms are analysis stage. A simple way is to replace the word with all its synonyms ( including original word), but simply using this kind of processor will change the token position and offsets, modifying the actual content of the document . I am from Bombay will become I am from Bombay Mumbai which can be annoying. So a clever approach must be investigated. 2015-05-26 17:36 GMT+01:00 Aman Tandon amantandon...@gmail.com: Okay So how could I do it with UpdateProcessors? With Regards Aman Tandon On Tue, May 26, 2015 at 10:00 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: mmm this is different ! Without any customisation, right now you could : - use docValues to provide exact value facets. - Than you can use a copy field, with the proper analysis, to search when a user click on a filter ! So you will see in your facets : Mumbai(3) Bombay(2) And when clicking you see 5 results. A little bit misleading for the users … On the other hand if you you want to apply the synonyms before, the indexing pipeline ( because docValues field can not be analysed), I think you should play with UpdateProcessors. Cheers 2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com : We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city. 
*In city we have also added the synonym for cities like mumbai, bombay (These are Indian cities). So that result of mumbai is also eligible when somebody will applying filter of bombay on search results. I need this functionality to apply with docValues enabled field. With Regards Aman Tandon On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I checked in the Documentation to be sure, but apparently : DocValues are only available
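[Editor's note] Since docValues fields cannot be analyzed, the multivalued workaround discussed in this thread amounts to expanding synonyms before the document reaches the index, either client-side or in an UpdateRequestProcessor. A minimal Python sketch of that expansion step follows; the synonym map and the function name are invented for illustration, and a real deployment would load the map from a synonyms file.

```python
# Illustrative bidirectional synonym map for city names.
CITY_SYNONYMS = {"mumbai": ["bombay"], "bombay": ["mumbai"]}

def expand_city_values(values):
    """Return the original city values plus their synonyms, deduplicated,
    so a docValues (string) field matches either spelling at facet/filter time."""
    out = []
    for v in values:
        for candidate in [v] + CITY_SYNONYMS.get(v.lower(), []):
            if candidate not in out:
                out.append(candidate)
    return out
```

For example, a document with city "mumbai" would be indexed with both "mumbai" and "bombay". The trade-off noted above still applies: facet counts will then show both values, so the alias entries must be merged or hidden when rendering facets to users.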
Re: SolrCloud: Will creating more shards at runtime lower the load?
Thank you Alessandro. With Regards Aman Tandon On Thu, May 28, 2015 at 3:57 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: Hi Aman, this feature may be interesting for you: Shard Splitting. When you create a collection in SolrCloud, you decide on the initial number of shards to be used. But it can be difficult to know in advance the number of shards that you need, particularly when organizational requirements can change at a moment's notice, and the cost of finding out later that you chose wrong can be high, involving creating new cores and re-indexing all of your data. The ability to split shards is in the Collections API. It currently allows splitting a shard into two pieces. The existing shard is left as-is, so the split action effectively makes two copies of the data as new shards. You can delete the old shard at a later time when you're ready. More details on how to use shard splitting are in the section on the Collections API https://cwiki.apache.org/confluence/display/solr/Collections+API. To answer your questions: 1) If your shard is properly split, and you use SolrCloud to distribute requests and balance load, users will not notice anything. 2) It can, but you must be careful: you may want to add replicas instead if load is your main concern. Usually sharding addresses an increasing amount of content to process and search; adding replicas addresses an increasing volume of queries and high load on the servers. Let me know more details if you like! Cheers 2015-05-28 4:44 GMT+01:00 Aman Tandon amantandon...@gmail.com: Hi, I have a question regarding SolrCloud. The load on our search servers is increasing day by day as our number of visitors keeps growing. I want to slice the data at runtime by creating more shards. *i)* Does it affect the current queries? *ii)* Does it lower the load on our search servers? 
With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
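For intuition about what the SPLITSHARD action does, it carves the parent shard's hash range into two contiguous sub-ranges and indexes each half as a new shard. A minimal Python sketch of that arithmetic (the function name is illustrative, not Solr API):

```python
def split_range(lo: int, hi: int):
    """Split an inclusive hash range [lo, hi] into two contiguous halves,
    roughly what SPLITSHARD does to a parent shard's range."""
    mid = (lo + hi) // 2
    return (lo, mid), (mid + 1, hi)

# The actual split is triggered via the Collections API, e.g.:
# /admin/collections?action=SPLITSHARD&collection=mycoll&shard=shard1
left, right = split_range(-2**31, 2**31 - 1)  # full signed 32-bit hash space
```

Queries keep being served by the parent shard during the split, which is why clients do not notice the operation.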
Guidance needed to modify ExtendedDismaxQParserPlugin
Hi, *Problem Statement: *query - i need leather jute bags. We are searching on the *title *field using pf2 ( *server:8003/solr/core0/select?q=i%20need%20leather%20jute%20bagspf2=titlexdebug=querydefType=edismaxwt=xmlrows=0*). Currently it creates shingled phrases like i need, need leather, leather jute, jute bags. *str name=parsedquery_toString+(((title:i)~0.01 (title:need)~0.01 (title:leather)~0.01 (title:jute)~0.01 (title:bag)~0.01)~3) ((titlex:i need)~0.01 (titlex:need leather)~0.01 (titlex:leather jute)~0.01 (titlex:jute bag)~0.01)/str* *Requirement: * I want to customize ExtendedDismaxQParserPlugin to generate custom phrase queries for pf2: only the phrase tokens jute bags and leather jute bags, so that irrelevant tokens like *i need* and *need leather* don't match any search results. In most scenarios in our business we have observed (from Google Analytics) that the last two words of the query matter most. So I need to generate only these tokens by calling my own function instead of *addShingledPhraseQueries*. Please guide me: should I modify the same Java class or create another class? And in the case of another class, how and where should I define our customized *defType*? With Regards Aman Tandon
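To illustrate the requested change, here is a hedged Python sketch (function names are mine, not Solr's): the default pf2 behavior shingles every adjacent word pair, while the proposed variant keeps only the trailing shingles. In Solr itself this would mean subclassing the edismax query parser plugin, overriding the method that builds the shingled phrase queries, and registering the subclass under a custom name in solrconfig.xml so it can be selected with defType.

```python
def all_bigrams(tokens):
    """Default edismax pf2: every adjacent word pair becomes a phrase query."""
    return [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]

def trailing_shingles(tokens, sizes=(2, 3)):
    """Proposed behavior: only the last two- and three-word phrases, since
    analytics showed the trailing words of the query carry the intent."""
    return [" ".join(tokens[-n:]) for n in sizes if len(tokens) >= n]

query = "i need leather jute bags".split()
# all_bigrams(query)       -> i need / need leather / leather jute / jute bags
# trailing_shingles(query) -> jute bags / leather jute bags
```

The trailing-shingle variant drops exactly the pairs ("i need", "need leather") the poster wants excluded, while keeping the two business-relevant phrases.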
SolrCloud: Will creating more shards at runtime lower the load?
Hi, I have a question regarding SolrCloud. The load on our search servers is increasing day by day as our number of visitors keeps growing. I want to slice the data at runtime by creating more shards. *i)* Does it affect the current queries? *ii)* Does it lower the load on our search servers? With Regards Aman Tandon
Re: docValues: Can we apply synonym
OK, and which synonym processor are you talking about? Maybe it could help. With Regards Aman Tandon On Thu, May 28, 2015 at 4:01 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Sorry, my bad. The synonym processor I mentioned works differently. It's an extension of the eDisMax query parser and doesn't require field-level synonym configs. -Original Message- From: Reitzel, Charles [mailto:charles.reit...@tiaa-cref.org] Sent: Wednesday, May 27, 2015 6:12 PM To: solr-user@lucene.apache.org Subject: RE: docValues: Can we apply synonym But the query analysis isn't on a specific field, it is applied to the query string. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Wednesday, May 27, 2015 6:08 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Hi Charles, The problem here is that docValues works only with primitive data types like String, int, etc. So how could we apply synonyms to a primitive data type? With Regards Aman Tandon On Thu, May 28, 2015 at 3:19 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Is there any reason you cannot apply the synonyms at query time? Applying synonyms at indexing time has problems, e.g. polluting the term frequency for synonyms added, preventing distance queries, ... Since city names often have multiple terms, e.g. New York, Den Hague, etc., I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. Tastes great, less filling. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ We found this to fix synonyms like ny for New York and vice versa. Haven't tried it with docValues, tho. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Tuesday, May 26, 2015 11:15 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Yes it could be :) Anyway thanks for helping. 
With Regards Aman Tandon On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I should investigate that, as usually synonyms are analysis stage. A simple way is to replace the word with all its synonyms ( including original word), but simply using this kind of processor will change the token position and offsets, modifying the actual content of the document . I am from Bombay will become I am from Bombay Mumbai which can be annoying. So a clever approach must be investigated. 2015-05-26 17:36 GMT+01:00 Aman Tandon amantandon...@gmail.com: Okay So how could I do it with UpdateProcessors? With Regards Aman Tandon On Tue, May 26, 2015 at 10:00 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: mmm this is different ! Without any customisation, right now you could : - use docValues to provide exact value facets. - Than you can use a copy field, with the proper analysis, to search when a user click on a filter ! So you will see in your facets : Mumbai(3) Bombay(2) And when clicking you see 5 results. A little bit misleading for the users … On the other hand if you you want to apply the synonyms before, the indexing pipeline ( because docValues field can not be analysed), I think you should play with UpdateProcessors. Cheers 2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com: We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city. *In city we have also added the synonym for cities like mumbai, bombay (These are Indian cities). So that result of mumbai is also eligible when somebody will applying filter of bombay on search results. I need this functionality to apply with docValues enabled field. With Regards Aman Tandon On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I checked in the Documentation to be sure, but apparently : DocValues are only available for specific field types. 
The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are: - StrField and UUIDField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type. - If the field is multi-valued, Lucene will use the SORTED_SET type. - Any Trie* numeric fields and EnumField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type. - If the field is multi-valued, Lucene will use
Re: docValues: Can we apply synonym
Hi Charles, The problem here is that the docValues works only with primitives data type only like String, int, etc So how could we apply synonym on primitive data type. With Regards Aman Tandon On Thu, May 28, 2015 at 3:19 AM, Reitzel, Charles charles.reit...@tiaa-cref.org wrote: Is there any reason you cannot apply the synonyms at query time? Applying synonyms at indexing time has problems, e.g. polluting the term frequency for synonyms added, preventing distance queries, ... Since city names often have multiple terms, e.g. New York, Den Hague, etc., I would recommend using Nolan Lawson's SynonymExpandingExtendedDismaxQParserPlugin. Tastes great, less filling. http://nolanlawson.com/2012/10/31/better-synonym-handling-in-solr/ We found this to fix synonyms like ny for New York and vice versa. Haven't tried it with docValues, tho. -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Tuesday, May 26, 2015 11:15 PM To: solr-user@lucene.apache.org Subject: Re: docValues: Can we apply synonym Yes it could be :) Anyway thanks for helping. With Regards Aman Tandon On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I should investigate that, as usually synonyms are analysis stage. A simple way is to replace the word with all its synonyms ( including original word), but simply using this kind of processor will change the token position and offsets, modifying the actual content of the document . I am from Bombay will become I am from Bombay Mumbai which can be annoying. So a clever approach must be investigated. 2015-05-26 17:36 GMT+01:00 Aman Tandon amantandon...@gmail.com: Okay So how could I do it with UpdateProcessors? With Regards Aman Tandon On Tue, May 26, 2015 at 10:00 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: mmm this is different ! Without any customisation, right now you could : - use docValues to provide exact value facets. 
- Than you can use a copy field, with the proper analysis, to search when a user click on a filter ! So you will see in your facets : Mumbai(3) Bombay(2) And when clicking you see 5 results. A little bit misleading for the users … On the other hand if you you want to apply the synonyms before, the indexing pipeline ( because docValues field can not be analysed), I think you should play with UpdateProcessors. Cheers 2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com: We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city. *In city we have also added the synonym for cities like mumbai, bombay (These are Indian cities). So that result of mumbai is also eligible when somebody will applying filter of bombay on search results. I need this functionality to apply with docValues enabled field. With Regards Aman Tandon On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I checked in the Documentation to be sure, but apparently : DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are: - StrField and UUIDField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type. - If the field is multi-valued, Lucene will use the SORTED_SET type. - Any Trie* numeric fields and EnumField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type. - If the field is multi-valued, Lucene will use the SORTED_SET type. This means you should not analyse a field where DocValues is enabled. Can your explain us your use case ? Why are you interested in synonyms DocValues level ? Cheers 2015-05-26 13:32 GMT+01:00 Upayavira u...@odoko.co.uk: To my understanding, docValues are just an uninverted index. That is, it contains the terms that are generated at the end of an analysis chain. 
Therefore, you simply enable docValues and include the SynonymFilterFactory in your analysis. Is that enough, or are you struggling with some other issue? Upayavira On Tue, May 26, 2015, at 12:03 PM, Aman Tandon wrote: Hi, We have some field *city* in which the docValues are enabled. We need to add the synonym in that field so how could we do it? With Regards Aman Tandon
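The UpdateProcessor idea suggested in this thread amounts to normalizing the field value before it reaches the un-analyzable docValues field. A minimal sketch of that logic in Python (the mapping and function names are illustrative; in Solr this would live in a custom UpdateRequestProcessorFactory wired into an updateRequestProcessorChain in solrconfig.xml):

```python
# Illustrative synonym table: every variant maps to one canonical value.
CITY_CANONICAL = {"bombay": "mumbai", "mumbai": "mumbai"}

def canonicalize_city(doc, field="city"):
    """Rewrite the city value to its canonical form at index time, so a
    facet on the docValues-backed string field groups Bombay and Mumbai
    into a single bucket instead of two misleading ones."""
    value = doc.get(field)
    if value is not None:
        doc[field] = CITY_CANONICAL.get(value.lower(), value.lower())
    return doc
```

This avoids the "Mumbai(3) Bombay(2)" split Alessandro describes, because both variants are collapsed before the document is indexed.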
Re: docValues: Can we apply synonym
Yes it could be :) Anyway thanks for helping. With Regards Aman Tandon On Tue, May 26, 2015 at 10:22 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I should investigate that, as usually synonyms are analysis stage. A simple way is to replace the word with all its synonyms ( including original word), but simply using this kind of processor will change the token position and offsets, modifying the actual content of the document . I am from Bombay will become I am from Bombay Mumbai which can be annoying. So a clever approach must be investigated. 2015-05-26 17:36 GMT+01:00 Aman Tandon amantandon...@gmail.com: Okay So how could I do it with UpdateProcessors? With Regards Aman Tandon On Tue, May 26, 2015 at 10:00 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: mmm this is different ! Without any customisation, right now you could : - use docValues to provide exact value facets. - Than you can use a copy field, with the proper analysis, to search when a user click on a filter ! So you will see in your facets : Mumbai(3) Bombay(2) And when clicking you see 5 results. A little bit misleading for the users … On the other hand if you you want to apply the synonyms before, the indexing pipeline ( because docValues field can not be analysed), I think you should play with UpdateProcessors. Cheers 2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com: We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city. *In city we have also added the synonym for cities like mumbai, bombay (These are Indian cities). So that result of mumbai is also eligible when somebody will applying filter of bombay on search results. I need this functionality to apply with docValues enabled field. 
With Regards Aman Tandon On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I checked in the Documentation to be sure, but apparently : DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are: - StrField and UUIDField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type. - If the field is multi-valued, Lucene will use the SORTED_SET type. - Any Trie* numeric fields and EnumField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type. - If the field is multi-valued, Lucene will use the SORTED_SET type. This means you should not analyse a field where DocValues is enabled. Can your explain us your use case ? Why are you interested in synonyms DocValues level ? Cheers 2015-05-26 13:32 GMT+01:00 Upayavira u...@odoko.co.uk: To my understanding, docValues are just an uninverted index. That is, it contains the terms that are generated at the end of an analysis chain. Therefore, you simply enable docValues and include the SynonymFilterFactory in your analysis. Is that enough, or are you struggling with some other issue? Upayavira On Tue, May 26, 2015, at 12:03 PM, Aman Tandon wrote: Hi, We have some field *city* in which the docValues are enabled. We need to add the synonym in that field so how could we do it? With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? 
William Blake - Songs of Experience -1794 England -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
Re: Help/Guidance Needed: To reload kstem protword hash without full core reload
Thank you so much Ahmet :) With Regards Aman Tandon On Wed, May 27, 2015 at 1:29 AM, Ahmet Arslan iori...@yahoo.com wrote: Hi Aman, Start by creating a Jira account and voting for/watching that issue. Post on the issue to see if there is still interest in it. Say that you are willing to volunteer and ask kindly for guidance. The creator of the issue or one of the watchers may respond. Try to digest the ideas discussed on the issue. Raise yours. Collaborate. Don't get discouraged if nobody responds; please remember that committers are busy people. If you have implemented something you want to share, upload a patch: https://wiki.apache.org/solr/HowToContribute Good luck, Ahmet On Tuesday, May 26, 2015 7:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi Ahmet, Can you please guide me on how to contribute to this *issue*? I haven't done this before, so I need to know what I should learn and how I should start: which IDE, or whatever else you think a novice needs to know. I will be thankful to you :) With Regards Aman Tandon On Tue, May 19, 2015 at 8:10 PM, Aman Tandon amantandon...@gmail.com wrote: That link you provided is exactly what I want to do. Thanks Ahmet. With Regards Aman Tandon On Tue, May 19, 2015 at 5:06 PM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi Aman, changing protected words without reindexing makes little or no sense. Regarding protected words, the trend is to use solr.KeywordMarkerFilterFactory. Instead I suggest you work on a more general issue: https://issues.apache.org/jira/browse/SOLR-1307 Ahmet On Tuesday, May 19, 2015 3:16 AM, Aman Tandon amantandon...@gmail.com wrote: Please help, or was my question unclear? With Regards Aman Tandon On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, *Problem Statement: *I want to reload the hash of protwords created by the kstem filter without reloading the whole index core. 
*My Thought: *I am thinking to reload the hash by passing a parameter like *r=1 *to analysis url request (to somehow pass the parameter via url). And I am thinking if somehow by changing the IndexSchema.java I might can pass this parameter though my analyzer chain to KStemFilter. In which I will call the initializeDictionary function to make protwords hash again from the file if *r=1*, instead of making full core reload request. Please guide me, I know question might be stupid, the thought came in my mind and I want to share and ask some suggestions here. Is it possible or not and how can i achieve the same? I will be thankful for guidance. With Regards Aman Tandon On Tuesday, May 19, 2015 3:16 AM, Aman Tandon amantandon...@gmail.com wrote: Please help or I am not clear here? With Regards Aman Tandon On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, *Problem Statement: *I want to reload an hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am thinking to reload the hash by passing a parameter like *r=1 *to analysis url request (to somehow pass the parameter via url). And I am thinking if somehow by changing the IndexSchema.java I might can pass this parameter though my analyzer chain to KStemFilter. In which I will call the initializeDictionary function to make protwords hash again from the file if *r=1*, instead of making full core reload request. Please guide me, I know question might be stupid, the thought came in my mind and I want to share and ask some suggestions here. Is it possible or not and how can i achieve the same? I will be thankful for guidance. With Regards Aman Tandon
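The reload-without-core-reload idea can be sketched independently of Solr: rebuild the protword set whenever the backing file's modification time changes, rather than via an explicit URL parameter. Everything below is illustrative Python, not Solr or Lucene API:

```python
import os

class ReloadableWordSet:
    """Sketch of the idea in this thread: rebuild the protected-word set
    when the backing file changes, instead of reloading the whole core.
    Names are illustrative, not Solr's."""

    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.words = set()
        self.refresh()

    def refresh(self):
        """Re-read the file only if its modification time has changed."""
        mtime = os.path.getmtime(self.path)
        if mtime != self.mtime:
            with open(self.path) as f:
                self.words = {line.strip() for line in f if line.strip()}
            self.mtime = mtime

    def is_protected(self, word):
        self.refresh()
        return word in self.words
```

The mtime check makes the common case (file unchanged) a single stat call, so lookups stay cheap while edits to the file take effect without any reload request.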
docValues: Can we apply synonym
Hi, We have a field *city* on which docValues are enabled. We need to add synonyms to that field; how can we do it? With Regards Aman Tandon
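For reference, a docValues field definition in schema.xml looks like the sketch below (field name taken from the thread; attribute values are illustrative). Because the string type is not analyzed, no analysis chain, and hence no SynonymFilter, can run on it, which is the crux of the discussion that follows:

```xml
<!-- Un-analyzed string field: docValues works, but no SynonymFilter can run. -->
<field name="city" type="string" indexed="true" stored="true" docValues="true"/>
```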
Re: Help/Guidance Needed: To reload kstem protword hash without full core reload
Hi Ahmet, Can you please guide me to contribute for this *issue*. I haven't did this before. So I need to know...what should I need to know and how should I start..what IDE or whatever you thought is need to know for a novice. I will be thankful to you :) With Regards Aman Tandon On Tue, May 19, 2015 at 8:10 PM, Aman Tandon amantandon...@gmail.com wrote: That link you provided is exactly I want to do. Thanks Ahmet. With Regards Aman Tandon On Tue, May 19, 2015 at 5:06 PM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi Aman, changing protected words without reindexing makes little or no sense. Regarding protected words, trend is to use solr.KeywordMarkerFilterFactory. Instead I suggest you to work on a more general issue: https://issues.apache.org/jira/browse/SOLR-1307 Ahmet On Tuesday, May 19, 2015 3:16 AM, Aman Tandon amantandon...@gmail.com wrote: Please help or I am not clear here? With Regards Aman Tandon On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, *Problem Statement: *I want to reload an hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am thinking to reload the hash by passing a parameter like *r=1 *to analysis url request (to somehow pass the parameter via url). And I am thinking if somehow by changing the IndexSchema.java I might can pass this parameter though my analyzer chain to KStemFilter. In which I will call the initializeDictionary function to make protwords hash again from the file if *r=1*, instead of making full core reload request. Please guide me, I know question might be stupid, the thought came in my mind and I want to share and ask some suggestions here. Is it possible or not and how can i achieve the same? I will be thankful for guidance. With Regards Aman Tandon On Tuesday, May 19, 2015 3:16 AM, Aman Tandon amantandon...@gmail.com wrote: Please help or I am not clear here? 
With Regards Aman Tandon On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, *Problem Statement: *I want to reload an hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am thinking to reload the hash by passing a parameter like *r=1 *to analysis url request (to somehow pass the parameter via url). And I am thinking if somehow by changing the IndexSchema.java I might can pass this parameter though my analyzer chain to KStemFilter. In which I will call the initializeDictionary function to make protwords hash again from the file if *r=1*, instead of making full core reload request. Please guide me, I know question might be stupid, the thought came in my mind and I want to share and ask some suggestions here. Is it possible or not and how can i achieve the same? I will be thankful for guidance. With Regards Aman Tandon
Re: docValues: Can we apply synonym
Okay So how could I do it with UpdateProcessors? With Regards Aman Tandon On Tue, May 26, 2015 at 10:00 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: mmm this is different ! Without any customisation, right now you could : - use docValues to provide exact value facets. - Than you can use a copy field, with the proper analysis, to search when a user click on a filter ! So you will see in your facets : Mumbai(3) Bombay(2) And when clicking you see 5 results. A little bit misleading for the users … On the other hand if you you want to apply the synonyms before, the indexing pipeline ( because docValues field can not be analysed), I think you should play with UpdateProcessors. Cheers 2015-05-26 17:18 GMT+01:00 Aman Tandon amantandon...@gmail.com: We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city. *In city we have also added the synonym for cities like mumbai, bombay (These are Indian cities). So that result of mumbai is also eligible when somebody will applying filter of bombay on search results. I need this functionality to apply with docValues enabled field. With Regards Aman Tandon On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I checked in the Documentation to be sure, but apparently : DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are: - StrField and UUIDField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type. - If the field is multi-valued, Lucene will use the SORTED_SET type. - Any Trie* numeric fields and EnumField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type. - If the field is multi-valued, Lucene will use the SORTED_SET type. This means you should not analyse a field where DocValues is enabled. 
Can your explain us your use case ? Why are you interested in synonyms DocValues level ? Cheers 2015-05-26 13:32 GMT+01:00 Upayavira u...@odoko.co.uk: To my understanding, docValues are just an uninverted index. That is, it contains the terms that are generated at the end of an analysis chain. Therefore, you simply enable docValues and include the SynonymFilterFactory in your analysis. Is that enough, or are you struggling with some other issue? Upayavira On Tue, May 26, 2015, at 12:03 PM, Aman Tandon wrote: Hi, We have some field *city* in which the docValues are enabled. We need to add the synonym in that field so how could we do it? With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
Re: docValues: Can we apply synonym
We are interested in using docValues for better memory utilization and speed. Currently we are faceting the search results on *city. *In city we have also added the synonym for cities like mumbai, bombay (These are Indian cities). So that result of mumbai is also eligible when somebody will applying filter of bombay on search results. I need this functionality to apply with docValues enabled field. With Regards Aman Tandon On Tue, May 26, 2015 at 9:19 PM, Alessandro Benedetti benedetti.ale...@gmail.com wrote: I checked in the Documentation to be sure, but apparently : DocValues are only available for specific field types. The types chosen determine the underlying Lucene docValue type that will be used. The available Solr field types are: - StrField and UUIDField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the SORTED type. - If the field is multi-valued, Lucene will use the SORTED_SET type. - Any Trie* numeric fields and EnumField. - If the field is single-valued (i.e., multi-valued is false), Lucene will use the NUMERIC type. - If the field is multi-valued, Lucene will use the SORTED_SET type. This means you should not analyse a field where DocValues is enabled. Can your explain us your use case ? Why are you interested in synonyms DocValues level ? Cheers 2015-05-26 13:32 GMT+01:00 Upayavira u...@odoko.co.uk: To my understanding, docValues are just an uninverted index. That is, it contains the terms that are generated at the end of an analysis chain. Therefore, you simply enable docValues and include the SynonymFilterFactory in your analysis. Is that enough, or are you struggling with some other issue? Upayavira On Tue, May 26, 2015, at 12:03 PM, Aman Tandon wrote: Hi, We have some field *city* in which the docValues are enabled. We need to add the synonym in that field so how could we do it? 
With Regards Aman Tandon -- -- Benedetti Alessandro Visiting card : http://about.me/alessandro_benedetti Tyger, tyger burning bright In the forests of the night, What immortal hand or eye Could frame thy fearful symmetry? William Blake - Songs of Experience -1794 England
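Charles's query-time suggestion can be sketched as expanding one filter value into an OR over its equivalents, leaving the indexed docValues field untouched. A hedged Python illustration (the synonym table and function are mine, not the plugin's API):

```python
# Illustrative bidirectional synonym table.
SYNONYMS = {"bombay": ["mumbai"], "mumbai": ["bombay"], "ny": ["new york"]}

def expand_filter(field, value):
    """Turn one filter click into an OR over all synonym variants, the same
    effect a query-time synonym-expanding parser achieves without touching
    the un-analyzed indexed values."""
    variants = [value.lower()] + SYNONYMS.get(value.lower(), [])
    return " OR ".join('{}:"{}"'.format(field, v) for v in variants)

# expand_filter("city", "Bombay") matches documents indexed under either name.
```

Because the expansion happens at query time, term frequencies and docValues facet counts stay exactly as indexed, avoiding the index-time pollution Charles warns about.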
Re: Help/Guidance Needed: To reload kstem protword hash without full core reload
That link you provided is exactly I want to do. Thanks Ahmet. With Regards Aman Tandon On Tue, May 19, 2015 at 5:06 PM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi Aman, changing protected words without reindexing makes little or no sense. Regarding protected words, trend is to use solr.KeywordMarkerFilterFactory. Instead I suggest you to work on a more general issue: https://issues.apache.org/jira/browse/SOLR-1307 Ahmet On Tuesday, May 19, 2015 3:16 AM, Aman Tandon amantandon...@gmail.com wrote: Please help or I am not clear here? With Regards Aman Tandon On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, *Problem Statement: *I want to reload an hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am thinking to reload the hash by passing a parameter like *r=1 *to analysis url request (to somehow pass the parameter via url). And I am thinking if somehow by changing the IndexSchema.java I might can pass this parameter though my analyzer chain to KStemFilter. In which I will call the initializeDictionary function to make protwords hash again from the file if *r=1*, instead of making full core reload request. Please guide me, I know question might be stupid, the thought came in my mind and I want to share and ask some suggestions here. Is it possible or not and how can i achieve the same? I will be thankful for guidance. With Regards Aman Tandon On Tuesday, May 19, 2015 3:16 AM, Aman Tandon amantandon...@gmail.com wrote: Please help or I am not clear here? With Regards Aman Tandon On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote: Hi, *Problem Statement: *I want to reload an hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am thinking to reload the hash by passing a parameter like *r=1 *to analysis url request (to somehow pass the parameter via url). 
And I am thinking if somehow by changing the IndexSchema.java I might can pass this parameter though my analyzer chain to KStemFilter. In which I will call the initializeDictionary function to make protwords hash again from the file if *r=1*, instead of making full core reload request. Please guide me, I know question might be stupid, the thought came in my mind and I want to share and ask some suggestions here. Is it possible or not and how can i achieve the same? I will be thankful for guidance. With Regards Aman Tandon
Help/Guidance Needed: To reload kstem protword hash without full core reload
Hi, *Problem Statement: *I want to reload the hash of protwords created by the kstem filter without reloading the whole index core. *My Thought: *I am thinking of reloading the hash by passing a parameter like *r=1 *to the analysis URL request. By changing IndexSchema.java I might be able to pass this parameter through my analyzer chain to KStemFilter, which would then call the initializeDictionary function to rebuild the protwords hash from the file if *r=1*, instead of requiring a full core reload. Please guide me; I know the question might be naive, but the thought came to mind and I want to ask for suggestions here. Is it possible, and how can I achieve it? I will be thankful for guidance. With Regards Aman Tandon
Re: Help/Guidance Needed: To reload kstem protwords hash without full core reload
Please help, or was I not clear? With Regards Aman Tandon

On Mon, May 18, 2015 at 9:47 PM, Aman Tandon amantandon...@gmail.com wrote:
Hi,

*Problem Statement:* I want to reload the hash of protected words built by the KStem filter without reloading the whole index core.

*My Thought:* I am thinking of reloading the hash by passing a parameter like *r=1* to the analysis URL request (somehow passing the parameter via the URL). By changing IndexSchema.java, I might be able to pass this parameter through my analyzer chain to KStemFilter, in which I would call the initializeDictionary function to rebuild the protwords hash from the file when *r=1*, instead of making a full core reload request.

Please guide me. I know the question might be stupid, but the thought came to my mind and I wanted to share it and ask for suggestions here. Is it possible or not, and how can I achieve it? I will be thankful for guidance.

With Regards
Aman Tandon
Re: Searcher is opening twice on Reload
Thanks Chris, but the issue says the firstSearcher listener is opened twice, while in my case the firstSearcher opens first and then the newSearcher. Is it the same? With Regards Aman Tandon On Thu, May 14, 2015 at 11:05 PM, Chris Hostetter hossman_luc...@fucit.org wrote: I suspect you aren't doing anything wrong, I think it's the same as this bug... https://issues.apache.org/jira/browse/SOLR-7035 : Date: Thu, 14 May 2015 12:53:34 +0530 : From: Aman Tandon amantandon...@gmail.com : Reply-To: solr-user@lucene.apache.org : To: solr-user@lucene.apache.org solr-user@lucene.apache.org : Subject: Searcher is opening twice on Reload : : Hi, : : Please help me here: when I reload the core, my searcher is : opened twice. I am attaching the log output; please tell me what I am : doing wrong, or whether this is the default behavior on reload. : : May 14, 2015 12:47:38 PM org.apache.solr.spelling.DirectSolrSpellChecker : INFO: init: : {name=default,field=titlews,classname=solr.DirectSolrSpellChecker,distanceMeasure=internal,accuracy=0.5,maxEdits=1,minPrefix=1,maxInspections=5,minQueryLength=5,maxQueryFrequency=100.0,thresholdTokenFrequency=100.0} : May 14, 2015 12:47:38 PM : org.apache.solr.handler.component.SpellCheckComponent : INFO: No queryConverter defined, using default converter : May 14, 2015 12:47:38 PM : org.apache.solr.handler.component.QueryElevationComponent : INFO: Loading QueryElevation from data dir: elevate.xml : May 14, 2015 12:47:38 PM org.apache.solr.handler.ReplicationHandler : INFO: Commits will be reserved for 1 : May 14, 2015 12:47:38 PM org.apache.solr.core.QuerySenderListener : INFO: QuerySenderListener sending requests to Searcher@41dc3c83 [IM-Search] : main{StandardDirectoryReader(segments_dd4:82296:nrt : _jdq(4.8):C5602938/2310052:delGen=3132 : _jkq(4.8):C6860454/1398005:delGen=2992 : _jx2(4.8):C5237053/1505048:delGen=3241 : _joo(4.8):C5825253/1599671:delGen=3323 : _k4d(4.8):C5860360/1916531:delGen=3150 :
_o27(4.8):C5290435/1018865:delGen=370 : _mju(4.8):C5074973/1602707:delGen=1474 : _jka(4.8):C5172599/1774839:delGen=3202 : _nik(4.8):C4698916/1512091:delGen=804 _o8y(4.8):C1137592/521423:delGen=190 : _oeu(4.8):C469094/86291:delGen=29 _odq(4.8):C217505/65596:delGen=55 : _ogd(4.8):C50454/4155:delGen=5 _oea(4.8):C40833/7192:delGen=37 : _ofy(4.8):C73614/7273:delGen=13 _ogx(4.8):C395681/1388:delGen=4 : _ogh(4.8):C7676/70:delGen=2 _ohf(4.8):C108769/21:delGen=2 : _ogc(4.8):C24435/384:delGen=4 _ogi(4.8):C23088/158:delGen=3 : _ogj(4.8):C4217/2:delGen=1 _ohs(4.8):C7 _oh6(4.8):C20509/205:delGen=5 : _oh7(4.8):C3171 _oho(4.8):C6/1:delGen=1 _ohq(4.8):C1 : _ohv(4.8):C10484/996:delGen=2 _ohx(4.8):C500 _ohy(4.8):C1 _ohz(4.8):C1)} : May 14, 2015 12:47:43 PM org.apache.solr.core.SolrCore : INFO: [IM-Search] webapp=/solr path=/select : params={spellcheck=true&lon=0&q=q&wt=json&qt=opsview.monitor&lat=0&rows=0&ps=1} : hits=6 status=0 QTime=1 : May 14, 2015 12:47:44 PM org.apache.solr.core.SolrCore : INFO: [IM-Search] webapp=null path=null : params={start=0&event=firstSearcher&q=rice&distrib=false&qt=im.search.intent&rows=25} : hits=42749 status=0 QTime=5667 : May 14, 2015 12:47:58 PM org.apache.solr.request.UnInvertedField : INFO: UnInverted multi-valued field : {field=city,memSize=209216385,tindexSize=11029,time=3904,phase1=3783,nTerms=77614,bigTerms=3,termInstances=31291566,uses=0} : May 14, 2015 12:48:01 PM org.apache.solr.request.UnInvertedField : INFO: UnInverted multi-valued field : {field=biztype,memSize=208847178,tindexSize=40,time=1318,phase1=1193,nTerms=9,bigTerms=4,termInstances=1607459,uses=0} : May 14, 2015 12:48:01 PM org.apache.solr.core.SolrCore : INFO: [IM-Search] webapp=null path=null : params={start=0&event=firstSearcher&q=rice&distrib=false&qt=im.search&rows=25} : hits=57619 status=0 QTime=17194 : May 14, 2015 12:48:04 PM org.apache.solr.core.SolrCore : INFO: [IM-Search] webapp=null path=null :
params={start=0&event=firstSearcher&q=potassium+cyanide&distrib=false&qt=eto.search.offer&rows=20} : hits=443 status=0 QTime=3272 : May 14, 2015 12:48:09 PM org.apache.solr.core.SolrCore : INFO: [IM-Search] webapp=null path=null : params={start=0&event=firstSearcher&q=motor+spare+parts&distrib=false&qt=im.search&fq=attribs:(locprefglobal+locprefnational+locprefcity)&rows=20} : hits=107297 status=0 QTime=5254 : May 14, 2015 12:48:09 PM org.apache.solr.core.QuerySenderListener : INFO: QuerySenderListener done. : May 14, 2015 12:48:09 PM : org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener : INFO: Loading spell index for spellchecker: default : May 14, 2015 12:48:09 PM : org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener : INFO: Loading spell index for spellchecker: wordbreak : May 14, 2015 12:48:09 PM org.apache.solr.core.SolrCore : INFO: [IM-Search] Registered new
Re: Searcher is opening twice on Reload
Any help here? With Regards Aman Tandon

On Fri, May 15, 2015 at 1:24 PM, Aman Tandon amantandon...@gmail.com wrote:
Thanks Chris, but the issue says the firstSearcher listener is opened twice, while in my case the firstSearcher opens first and then the newSearcher. Is it the same?

On Thu, May 14, 2015 at 11:05 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
I suspect you aren't doing anything wrong, I think it's the same as this bug... https://issues.apache.org/jira/browse/SOLR-7035
[quoted logs snipped; identical to the output quoted earlier in the thread]
Re: Searcher is opening twice on Reload
Please help. The Solr version is 4.8.1. With Regards Aman Tandon

On Thu, May 14, 2015 at 12:53 PM, Aman Tandon amantandon...@gmail.com wrote:
Hi, please help me here: when I reload the core, my searcher is opened twice. I am attaching the log output; please tell me what I am doing wrong, or whether this is the default behavior on reload.
[quoted logs snipped; identical to the output quoted earlier in the thread]
Searcher is opening twice on Reload
:48:53 PM org.apache.solr.core.SolrCore INFO: [IM-Search] Registered new searcher Searcher@49093738[IM-Search] main{StandardDirectoryReader(segments_dd4:82296:nrt _jdq(4.8):C5602938/2310052:delGen=3132 _jkq(4.8):C6860454/1398005:delGen=2992 _jx2(4.8):C5237053/1505048:delGen=3241 _joo(4.8):C5825253/1599671:delGen=3323 _k4d(4.8):C5860360/1916531:delGen=3150 _o27(4.8):C5290435/1018865:delGen=370 _mju(4.8):C5074973/1602707:delGen=1474 _jka(4.8):C5172599/1774839:delGen=3202 _nik(4.8):C4698916/1512091:delGen=804 _o8y(4.8):C1137592/521423:delGen=190 _oeu(4.8):C469094/86291:delGen=29 _odq(4.8):C217505/65596:delGen=55 _ogd(4.8):C50454/4155:delGen=5 _oea(4.8):C40833/7192:delGen=37 _ofy(4.8):C73614/7273:delGen=13 _ogx(4.8):C395681/1388:delGen=4 _ogh(4.8):C7676/70:delGen=2 _ohf(4.8):C108769/21:delGen=2 _ogc(4.8):C24435/384:delGen=4 _ogi(4.8):C23088/158:delGen=3 _ogj(4.8):C4217/2:delGen=1 _ohs(4.8):C7 _oh6(4.8):C20509/205:delGen=5 _oh7(4.8):C3171 _oho(4.8):C6/1:delGen=1 _ohq(4.8):C1 _ohv(4.8):C10484/996:delGen=2 _ohx(4.8):C500 _ohy(4.8):C1 _ohz(4.8):C1)} With Regards Aman Tandon
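For context on the two events in play: warming queries like the firstSearcher ones visible in the logs are configured in solrconfig.xml with QuerySenderListener entries along the lines of the sketch below. The queries shown are illustrative placeholders, since the actual warming configuration is not included in the thread. firstSearcher fires only for the first searcher after a core is loaded, and because a RELOAD creates a fresh core object, its first searcher fires firstSearcher again; newSearcher fires for searchers registered after that, which is consistent with Aman seeing both events back to back on reload.

```xml
<!-- Sketch only: the real warming queries behind this log (qt=im.search
     etc.) are not shown in the thread; these entries are placeholders. -->
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">rice</str><str name="qt">im.search</str><str name="rows">25</str></lst>
  </arr>
</listener>
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">rice</str><str name="qt">im.search</str><str name="rows">25</str></lst>
  </arr>
</listener>
```

Whether the double opening here is the exact duplication tracked in SOLR-7035 is the open question of the thread; the config above only explains why both event types can legitimately appear on a reload.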