SolrCloud Replication issue

2012-12-23 Thread hupadhyay
Hi all,
I am using a *SolrCloud, 2 shard, 2 replica* configuration.
I am importing data into the Solr index using a JDBC data source and the Solr
DIH. When only the shard nodes are up and no replica is up, I import the data
into Solr and then start the replica nodes to replicate the index. This works
fine, and the data seems to be accurate on all nodes.

But when all the nodes are up (shards and replicas) while I am importing data,
at least one replica node from each shard ends up out of sync, with a few
thousand records missing on that node. The SolrCloud UI screen also displays
the old-index flag for these nodes.

I don't understand why this happens. Is it because I am importing 100k
records in one shot, and a few thousand records get skipped during indexing?

Can anyone please help me out here?

Thanks



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solrcloud-Relication-issue-tp4028927.html
Sent from the Solr - User mailing list archive at Nabble.com.
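
One way to confirm which cores are behind is to ask each core for its own
document count with distrib=false, which restricts a query to the core that
receives it, and then compare the counts within a shard. The sketch below only
builds the request URLs and compares counts that have already been fetched.
The host names, core names and counts are hypothetical and are not taken from
the message above.

```python
from urllib.parse import urlencode


def local_count_url(core_url):
    """Build a URL that counts documents on one core only (no distributed fan-out)."""
    params = {"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"}
    return f"{core_url}/select?{urlencode(params)}"


def out_of_sync(counts_by_core):
    """Return the cores whose count is below the highest count seen in the shard."""
    expected = max(counts_by_core.values())
    return sorted(core for core, n in counts_by_core.items() if n < expected)


# Hypothetical cores of a single shard; replace with the real node addresses.
shard1 = [
    "http://host1:8983/solr/collection1",
    "http://host2:8983/solr/collection1",
]
urls = [local_count_url(u) for u in shard1]

# Illustrative counts, as they might be read from "numFound" in each response.
lagging = out_of_sync({shard1[0]: 50000, shard1[1]: 47000})

print(urls[0])
print(lagging)
```

Running the comparison right after an import, and again after a commit, would
show whether the replica is permanently missing documents or only lagging.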


Re: multi field query with selective results

2012-12-23 Thread Shawn Heisey

On 12/23/2012 9:09 PM, Lance Norskog wrote:

> A thousand pardons! Thunderbird displayed your email as a hijack. Now,
> it does not. I really wish everyone's code could be free of bugs, like
> my code is :)


If you (or anyone else) have a problem like this again, go into
Thunderbird's config editor and type "thread" into the search box.  Make sure
that 'mail.strict_threading' and 'mail.correct_threading' are true.  On
version 17, these both default to true.  I think that in older versions
this may not have been the default, which would affect you if you've
been a long-time user and have been upgrading.


Thanks,
Shawn



Re: multi field query with selective results

2012-12-23 Thread Lance Norskog
A thousand pardons! Thunderbird displayed your email as a hijack. Now, 
it does not. I really wish everyone's code could be free of bugs, like 
my code is :)


On 12/23/2012 01:38 AM, J Mohamed Zahoor wrote:

> I don't think I hijacked any thread.  it is a new thread. Can you please
> enlighten me?
>
> On Sunday, December 23, 2012, Lance Norskog wrote:
>
>> Please start a new thread.
>>
>> Thanks!
>>
>> On 12/22/2012 11:03 AM, J Mohamed Zahoor wrote:
>>
>>> Hi
>>>
>>> I have a word completion requirement where i need to pick result from two
>>> indexed fields.
>>> The trick is i need to pick top 5 results from each field and display as
>>> suggestions.
>>>
>>> If i set fq as field1:XXX AND field2:XXX, the top result comes entirely
>>> from field1 matches.
>>> Is there any other way to get top 5 from field 1 matches and top 5 from
>>> field 2 matched results?
>>>
>>> ./Zahoor


Re: Struggling with solr 4.0 and zookeeper - multiple solr collection and configs

2012-12-23 Thread Erick Erickson
You haven't indicated whether you tried what I pointed you at, you only
repeated that you tried bootstrapping on the example directory, _not_
wherever you put your multi-core configuration.

I've seen the example I mentioned work like a champ. Really, try it.

I'd think about using the latest solr nightly build too...

Best
Erick


On Sun, Dec 23, 2012 at 2:55 AM, joe.cohe...@gmail.com <
joe.cohe...@gmail.com> wrote:

> If I run
> java -classpath example/solr-webapp/WEB-INF/lib/*
> org.apache.solr.cloud.ZkCLI -cmd bootstrap -zkhost 127.0.0.1:9983
> -solrhome example/solr
>
> on each collection, I end up having 3 different configs.
> But when I start solr, it is not able to run all 3 collections, each with
> its own config. It keeps searching for collection2's and collection3's
> config under collection1's relative config path.
>
>
> Erick Erickson wrote
> > On the solr cloud page, admittedly down the page a ways, is the line
> > below.
> > Does that apply?
> > Best
> > Erick
> >
> > # try bootstrapping all the conf dirs in solr.xml
> > java -classpath example/solr-webapp/WEB-INF/lib/*
> > org.apache.solr.cloud.ZkCLI -cmd bootstrap -zkhost 127.0.0.1:9983
> > -solrhome example/solr
> >
> >
> >
> > On Wed, Dec 19, 2012 at 1:46 PM, joe.cohen.m@ <joe.cohen.m@> wrote:
> >
> >> I'm trying to build the following solr cluster:
> >> 3 collections, with 3 different configuration sets, on multiple servers.
> >> It seems that solr can't use different config trees in the zookeeper at
> >> a certain time.
> >> Even if I manage to get to a state in which under the 'configs' node in
> >> the zookeeper, I have 3 config folders with the solr conf files, when I
> >> run solr, it seems like it picks one of them and looks for the other
> >> config files under the single one it picked.
> >>
> >> thus I get messages like " no zookeeper node found in
> >> /configs/collection1cong/collection2conf/solrconfig.xml"
> >> while I was assuming it should see that it has the node :
> >> /configs/collection2conf/solrconfig.xml.
> >>
> >> my zookeeper configs node looks like:
> >> configs/
> >> configs/collection1conf
> >> configs/collection2conf
> >> configs/collection3conf
> >> configs/collection1conf/
> > 
> >> configs/collection2conf/
> > 
> >> configs/collection3conf/
> > 
> >>
> >> I've tried many different ways of editing solr.xml and none helped:
> >> 1. setting full paths for each collection - an error says invalid path
> >> 2. setting relative paths for each collection - an error says it can't
> >> find the zookeeper node because it searches under
> >> defaultcollectionpath+relativepath
> >> 3. running with only one core - solr doesn't see the other collections.
> >>
> >>
> >> any idea?
> >> is this even possible with current solr version?
> >>
> >> thanks.
>
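
For anyone scripting the bootstrap step quoted above, the command can be
assembled as an argument list. This is a minimal sketch that uses only the
flags shown in the thread (-cmd bootstrap, -zkhost, -solrhome); the helper
name zkcli_bootstrap_argv is made up for illustration. Building a list rather
than a shell string keeps the trailing * in the classpath from being expanded
by the shell, so the JVM receives it as a classpath wildcard.

```python
import shlex


def zkcli_bootstrap_argv(zkhost, solrhome,
                         lib_glob="example/solr-webapp/WEB-INF/lib/*"):
    """Return the argument list for the ZkCLI bootstrap command from the thread."""
    return [
        "java",
        "-classpath", lib_glob,          # passed literally; the JVM expands the *
        "org.apache.solr.cloud.ZkCLI",
        "-cmd", "bootstrap",
        "-zkhost", zkhost,
        "-solrhome", solrhome,
    ]


argv = zkcli_bootstrap_argv("127.0.0.1:9983", "example/solr")

# shlex.join quotes the wildcard so the printed line is safe to paste into a shell.
command = shlex.join(argv)
print(command)
```

The list in argv could be handed to subprocess.run(argv, check=True) from the
directory that contains example/.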


Re: Exact Search

2012-12-23 Thread Erick Erickson
Well, string types are not analyzed at all, so if the town is "Dundee",
this will not match.

If you haven't seen the admin/analysis page, that's the first place I'd
start. Followed by adding &debugQuery=true and looking at the results.

Best
Erick


On Sat, Dec 22, 2012 at 1:04 PM, hank williams  wrote:

> Hi,
> I'm trying to build a facet search, but I'm having some difficulties.
> I can do a free text search over things, but I can't build exact queries.
> I know that I have a result that has this data
> iraq treatment of children hong
> kongiraq treatment
> of children hong kong
>  cavendish-bentinck, henrychurchill, winston
>ward, john
>  cavendish-bentinck, henrychurchill, winston
>ward, john
>  nottingham southcabinet
>  department of statedundee
>  stoke-on-trent stoke
>  secretary of state for the colonies />1422064629033467904
>
> And my schema looks like this
>  required="true" multiValued="false" />  type="text_general" indexed="true" stored="true" multiValued="true"/> name="name_long" type="string" indexed="true" stored="true"
> multiValued="true"/> stored="true" multiValued="true"/> indexed="true" stored="true" multiValued="true"/> type="string" indexed="true" stored="true" multiValued="true"/> name="date" type="date" indexed="true" stored="true"
> multiValued="true"/> stored="true" multiValued="true"/> indexed="true" stored="true" multiValued="true"/>
> How can I create an exact query with name_long='churchill, winston' AND
> label_long='iraq treatment of children hong kong' AND town='dundee'?
> When I try
> label_long:*iraq treatment of children hong kong* AND
> name_long:*churchill, winston* AND town:*dundee*
> I get zero results.
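
As a rough illustration of the exact-match query asked about above: because
string fields are not analyzed, each value has to be supplied whole and in the
same case as it was indexed, and quoting it keeps the query parser from
splitting it on whitespace. The sketch assumes that name_long, label_long and
town are all string fields and that the indexed values are the lowercase ones
shown in the message; the helper names are hypothetical.

```python
from urllib.parse import urlencode


def exact_clause(field, value):
    """Quote a value so it is matched as one whole term on a string field."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


def build_exact_query(criteria):
    """Join exact-match clauses with AND."""
    return " AND ".join(exact_clause(field, value) for field, value in criteria)


criteria = [
    ("name_long", "churchill, winston"),
    ("label_long", "iraq treatment of children hong kong"),
    ("town", "dundee"),
]
q = build_exact_query(criteria)

# debugQuery=true, as suggested in the reply, shows how the query was parsed.
params = urlencode({"q": q, "debugQuery": "true"})

print(q)
print(params)
```

If the town had been indexed as "Dundee", the clause town:"dundee" would not
match, which is the case-sensitivity point made in the reply.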


Re: Dynamic collections in SolrCloud for log indexing

2012-12-23 Thread Erick Erickson
I think this is one of the primary use-cases for custom sharding. Solr 4.0
doesn't really lend itself to this scenario, but I _believe_ that the patch
for custom sharding has been committed...

That said, I'm not quite sure how you drop off the old shard if you don't
need to keep old data. I'd guess it's possible, but haven't implemented
anything like that myself.

FWIW,
Erick


On Fri, Dec 21, 2012 at 12:17 PM, Upayavira  wrote:

> I'm working on a system for indexing logs. We're probably looking at
> filling one core every month.
>
> We'll maintain a short term index containing the last 7 days - that one
> is easy to handle.
>
> For the longer term stuff, we'd like to maintain a collection that will
> query across all the historic data, but that means every month we need
> to add another core to an existing collection, which as I understand it
> in 4.0 is not possible.
>
> How do people handle this sort of situation where you have rolling new
> content arriving? I'm sure I've heard people using SolrCloud for this
> sort of thing.
>
> Given it is logs, distributed IDF has no real bearing.
>
> Upayavira
>


Re: Bad performance while query pdf solr documents

2012-12-23 Thread Dirk Högemann
Do you really need them all in the response to show them in the results?
Since you now define them as not stored, that does not seem to be the case.


2012/12/23 Otis Gospodnetic 

> Hi,
>
> You can specify them in solrconfig.xml for your request handler, so you
> don't have to specify it for each query unless you want to override fl.
>
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/
> On Dec 23, 2012 4:39 AM, "uwe72"  wrote:
>
> > we have more than hundreds fields...i don't want to put them all to the
> > fl parameters
> >
> > is there a other way, like to say return all fields, except the
> > fields...?
> >
> > anyhow i will change the field from stored to stored=false in the schema.


Re: solr 4 plugins

2012-12-23 Thread Otis Gospodnetic
Hi,

You could use them for rescoring for sure. Faceting would require more
invasive surgery.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Dec 23, 2012 9:04 AM, "Giovanni Bricconi" 
wrote:

> This is really interesting!
> Do you know if these added fields can be used in sorting or faceting?
> Thanks
> On 23 Dec 2012 14:08, "Otis Gospodnetic" <otis.gospodne...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Look into writing a custom SearchComponent.
> >
> > Otis
> > Solr & ElasticSearch Support
> > http://sematext.com/
> > On Dec 23, 2012 2:07 AM, "Eyal Ben-Meir"  wrote:
> >
> > > Hi all,
> > > I want to use solr 4 as a full text search engine, but I need to make
> > > one of the query fields to get its answer not from lucene engine but
> > > from my own engine. The rest should continue as normal.
> > > Any ideas how to do it? Thanks.
> > >
> >
>


Re: solr 4 plugins

2012-12-23 Thread Giovanni Bricconi
This is really interesting!
Do you know if these added fields can be used in sorting or faceting?
Thanks
On 23 Dec 2012 14:08, "Otis Gospodnetic" wrote:

> Hi,
>
> Look into writing a custom SearchComponent.
>
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/
> On Dec 23, 2012 2:07 AM, "Eyal Ben-Meir"  wrote:
>
> > Hi all,
> > I want to use solr 4 as a full text search engine, but I need to make one
> > of the query fields to get its answer not from lucene engine but from my
> > own engine. The rest should continue as normal.
> > Any ideas how to do it? Thanks.
> >
>


Re: Bad performance while query pdf solr documents

2012-12-23 Thread uwe72
We have more than a hundred fields... I don't want to put them all in the fl
parameter.

Is there another way, like saying "return all fields except these fields"?

Anyhow, I will change the field from stored to stored=false in the schema.


Re: multi field query with selective results

2012-12-23 Thread J Mohamed Zahoor
I don't think I hijacked any thread.  it is a new thread. Can you please
enlighten me?

On Sunday, December 23, 2012, Lance Norskog wrote:

> Please start a new thread.
>
> Thanks!
>
> On 12/22/2012 11:03 AM, J Mohamed Zahoor wrote:
>
>> Hi
>>
>> I have a word completion requirement where i need to pick result from two
>> indexed fields.
>> The trick is i need to pick top 5 results from each field and display as
>> suggestions.
>>
>> If i set fq as field1:XXX AND field2:XXX, the top result comes entirely
>> from field1 matches.
>> Is there any other way to get top 5 from field 1 matches and top 5 from
>> field 2 matched results?
>>
>> ./Zahoor
>>
>
>
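
Nobody in the thread answered the original question, so the following is only
one possible approach and not a recommendation from the list: issue one
request per field, each limited to rows=5, and merge the two result lists on
the client. The field names field1 and field2 and the term XXX come from the
question; the helper names and the sample values are hypothetical.

```python
def top_n_params(field, term, n=5):
    """Parameters for one request that returns the top n matches on one field."""
    return {"q": f"{field}:{term}", "rows": n, "fl": f"{field},score"}


def merge_suggestions(per_field_values, n=5):
    """Keep at most n values per field, preserving each field's own ranking."""
    merged = []
    for field, values in per_field_values:
        merged.extend((field, value) for value in values[:n])
    return merged


# One request per field instead of a single query with field1:XXX AND field2:XXX.
requests = [top_n_params("field1", "XXX"), top_n_params("field2", "XXX")]

# Illustrative values standing in for what the two responses might contain.
merged = merge_suggestions([
    ("field1", ["a1", "a2", "a3", "a4", "a5", "a6", "a7"]),
    ("field2", ["b1", "b2", "b3"]),
])

print(requests)
print(merged)
```

This costs two round trips, but each field's top 5 is ranked independently, so
matches on one field cannot crowd out the other.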


Re: Bad performance while query pdf solr documents

2012-12-23 Thread Dirk Högemann
You can define the fields to be returned with the fl parameter:
fl=the,needed,fields - usually the score and the id...
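
A minimal sketch of what that looks like on the request, assuming the
documents have an id field and that only id and score are needed. It just
builds the query string; the helper name select_params is made up for
illustration.

```python
from urllib.parse import urlencode


def select_params(query, fields):
    """Build query-string parameters that limit the returned fields via fl."""
    return urlencode({"q": query, "fl": ",".join(fields)})


params = select_params("*:*", ["id", "score"])
print(params)
```

Leaving the large extracted content field out of fl means it is not sent back
with every document in the response.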

2012/12/23 uwe72 

> hi
>
> i am indexing pdf documents to solr by tika.
>
> when i do the query in the client with solrj the performance is very bad
> (40 seconds) to load 100 documents.
>
> Probably because it loads all the content. The content i don't need. How can
> i tell the query not to load the content?
>
> Or are there other reasons why the performance is so bad?
>
> Regards
> Uwe


Re: Bad performance while query pdf solr documents

2012-12-23 Thread Upayavira
You make the text field stored="false" in your schema, then reindex.
Then it won't show in search results.

Upayavira

On Sun, Dec 23, 2012, at 09:27 AM, uwe72 wrote:
> >>>your query-time fl parameter. 
> 
> means "don't return" this field?
> 
> because we have many many fields, so probably now i use the default and
> all fields will be loaded. so i just want to tell the query to don't load
> the "text" field. I do this with the fl parameter?


Re: Bad performance while query pdf solr documents

2012-12-23 Thread uwe72
>>>your query-time fl parameter. 

Does this mean "don't return" this field?

Because we have many, many fields, I am probably using the default now, so
all fields will be loaded. I just want to tell the query not to load the
"text" field. Do I do this with the fl parameter?