RE: SolrCloud deduplication

2012-05-18 Thread Markus Jelsma
you're right. I'll test the patch as soon as possible.
Thanks!

 
 
-Original message-
> From:Chris Hostetter 
> Sent: Fri 18-May-2012 18:20
> To: solr-user@lucene.apache.org
> Subject: RE: SolrCloud deduplication
> 
> 
> : Interesting! I'm watching the issues and will test as soon as they are 
> committed.
> 
> FWIW: it's a chicken-and-egg problem -- if you could test out the patch in 
> SOLR-2822 with your real-world use case / configs, and comment on its 
> effectiveness, that would go a long way towards my confidence in it.
> 
> 
> -Hoss
> 


Re: Duplicate documents being added even with unique key

2012-05-18 Thread Jack Krupansky
Typically the uniqueKey field is a "string" field type (your schema uses 
"text_general"), although I don't think it is supposed to be a requirement. 
Still, it is one thing that stands out.


Actually, you may be running into some variation of SOLR-1401:

https://issues.apache.org/jira/browse/SOLR-1401

In other words, stick with "string" and stay away from a tokenized (text) 
key.
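As a concrete sketch of that advice (the field name "uniquekey" comes from this thread; the indexed/stored flags are typical defaults, not taken from the poster's actual schema):

```xml
<!-- schema.xml: a non-tokenized key type avoids SOLR-1401-style surprises -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="uniquekey" type="string" indexed="true" stored="true" required="true"/>
<uniqueKey>uniquekey</uniqueKey>
```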


You could also get duplicates by merging cores, or if your "add" has 
allowDups="true" or overwrite="false".
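For reference, the second case corresponds to an update message like the one below (document content here is hypothetical). With overwrite disabled, Solr skips the uniqueKey replacement step, so repeated posts accumulate as duplicates:

```xml
<add overwrite="false"> <!-- or the legacy allowDups="true" -->
  <doc>
    <field name="uniquekey">Skill510</field>
  </doc>
</add>
```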


-- Jack Krupansky

-Original Message- 
From: Parmeley, Michael

Sent: Friday, May 18, 2012 5:50 PM
To: solr-user@lucene.apache.org
Subject: Duplicate documents being added even with unique key

I have a uniquekey set in my schema; however, I am still getting duplicated 
documents added. Can anyone provide any insight into why this may be 
happening?


This is in my schema.xml:


<uniqueKey>uniquekey</uniqueKey>

<field name="uniquekey" type="text_general" required="true" />


On startup I get this message in catalina.out:

INFO: unique key field: uniquekey

However, you can see I get multiple documents:



PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510

 



Re: Duplicate documents being added even with unique key

2012-05-18 Thread Erik Hatcher
Your unique key field should be of type "string" not a tokenized type. 

   Erik

On May 18, 2012, at 17:50, "Parmeley, Michael"  wrote:

> I have a uniquekey set in my schema; however, I am still getting duplicated 
> documents added. Can anyone provide any insight into why this may be 
> happening?
> 
> This is in my schema.xml:
> 
> 
> <uniqueKey>uniquekey</uniqueKey>
> 
> <field name="uniquekey" type="text_general" required="true" />
> 
> On startup I get this message in catalina.out:
> 
> INFO: unique key field: uniquekey
> 
> However, you can see I get multiple documents:
> 
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 


Duplicate documents being added even with unique key

2012-05-18 Thread Parmeley, Michael
I have a uniquekey set in my schema; however, I am still getting duplicated 
documents added. Can anyone provide any insight into why this may be happening?

This is in my schema.xml:


<uniqueKey>uniquekey</uniqueKey>

<field name="uniquekey" type="text_general" required="true" />

On startup I get this message in catalina.out:

INFO: unique key field: uniquekey

However, you can see I get multiple documents:



PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510


PSR3
1
Skill
510
Body and Soul
1
281
Skill510




Re: Solr 3.6.0 problem with multi-core and json

2012-05-18 Thread dm_tim
I should clarify the error a bit. When I make a select request on my first
core (called core0) using the wt=json parameter I get a 400 response with
the explanation "undefined field: gid". The field gid is not defined in the
schema.xml file of my first core. But, it is defined in the schema.xml file
of my third core (core2). Hopefully, this is a slightly better explanation
of the problem.

T



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-0-problem-with-multi-core-and-json-tp3984790p3984793.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: copyField

2012-05-18 Thread Tolga
Oh this one. Yes I have it. 

Sent from my myPhone

On 18 May 2012, at 23:14, Yury Kats wrote:

> On 5/18/2012 4:02 PM, Tolga wrote:
>> Default field? I'm not sure but I think I do. Will have to look. 
> 
> http://wiki.apache.org/solr/SchemaXml#The_Default_Search_Field


Solr 3.6.0 problem with multi-core and json

2012-05-18 Thread dm_tim
Howdy,

I have a multi-core setup in Solr 3.6.0 which works fine -- until I
request the response in JSON with the "wt=json" parameter. When I do that, it
looks like it's using the schema.xml file of one of my other cores, because it
complains that it cannot get a required field that exists in the schema.xml
of one of the other cores.

Has anyone seen this issue before? If so, what was the fix? I'm spending far
too much time trying to convert the default XML response into JSON. I'd like
to have the response returned in JSON.

Regards,

Tim


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-3-6-0-problem-with-multi-core-and-json-tp3984790.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: copyField

2012-05-18 Thread Yury Kats
On 5/18/2012 4:02 PM, Tolga wrote:
> Default field? I'm not sure but I think I do. Will have to look. 

http://wiki.apache.org/solr/SchemaXml#The_Default_Search_Field


Re: copyField

2012-05-18 Thread Tolga
Default field? I'm not sure but I think I do. Will have to look. 

Sent from my myPhone

On 18 May 2012, at 18:11, Yury Kats wrote:

> On 5/18/2012 9:54 AM, Tolga wrote:
>> Hi,
>> 
>> I've put the line > indexed="true"/> in my schema.xml and restarted Solr, crawled my 
>> website, and indexed (I've also committed but do I really have to 
>> commit?). But I still have to search with content:mykeyword at the admin 
>> interface. What do I have to do so that I can search only with mykeyword?
> 
> Do you have the default field defined?
> 


Re: Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-18 Thread Ravi Solr
Just to give folks an update: we trashed the server having issues and
cloned/rebuilt a VM from a sane server, and it seems to have been running well
for the past 3 days without any issues. We intend to monitor it over
the weekend. If it's still stable on Monday, I would blame the issues
on the server configuration. :-)

Thanks
Ravi Kiran Bhaskar

On Tue, May 15, 2012 at 2:57 PM, Ravi Solr  wrote:
> I have already triple cross-checked  that all my clients are using
> same version as the server which is 3.6
>
> Thanks
>
> Ravi Kiran
>
> On Tue, May 15, 2012 at 2:09 PM, Ramesh K Balasubramanian
>  wrote:
>> I have seen similar errors before when the solr version and solrj version in 
>> the client don't match.
>>
>> Best Regards,
>> Ramesh


Re: why no uppercase filter in solr

2012-05-18 Thread Walter Underwood
In Unicode, uppercasing characters loses information, because there are some 
upper case characters that represent more than one lower case character.

Lower casing text is safe, so always lower-case.
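Walter's point can be checked with a few lines of plain Java (an illustrative sketch, not from the thread): the German sharp s (ß) expands to "SS" on uppercasing, so the round trip does not restore the original string.

```java
import java.util.Locale;

public class CaseFoldDemo {
    public static void main(String[] args) {
        String original = "straße";                        // contains the German sharp s (ß)
        String upper = original.toUpperCase(Locale.ROOT);  // "STRASSE" -- ß expands to "SS"
        String roundTrip = upper.toLowerCase(Locale.ROOT); // "strasse" -- the ß is gone

        System.out.println(upper);
        System.out.println(roundTrip);
        System.out.println(original.equals(roundTrip));    // false: uppercasing lost information
    }
}
```

Because "SS" lowercases to "ss", an index that uppercases its terms can no longer distinguish words that differed only in ß vs. ss, which is the information loss described above.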

wunder

On May 18, 2012, at 10:41 AM, srinir wrote:

> I am wondering why Solr doesn't have an uppercase filter. I want the analyzed
> output to be in upper case to be compatible with legacy data. Will there be
> any problem if I create my own uppercase filter and use it?
> 







why no uppercase filter in solr

2012-05-18 Thread srinir
I am wondering why Solr doesn't have an uppercase filter. I want the analyzed
output to be in upper case to be compatible with legacy data. Will there be
any problem if I create my own uppercase filter and use it?





--
View this message in context: 
http://lucene.472066.n3.nabble.com/why-no-uppercase-filter-in-solr-tp3984758.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: SolrCloud deduplication

2012-05-18 Thread Chris Hostetter

: Interesting! I'm watching the issues and will test as soon as they are 
committed.

FWIW: it's a chicken-and-egg problem -- if you could test out the patch in 
SOLR-2822 with your real-world use case / configs, and comment on its 
effectiveness, that would go a long way towards my confidence in it.


-Hoss


Re: solrj with incorrect schema

2012-05-18 Thread Shawn Heisey

On 5/18/2012 8:50 AM, Mark Miller wrote:

On May 18, 2012, at 10:26 AM, Shawn Heisey wrote:

On 5/18/2012 1:42 AM, Jamel ESSOUSSI wrote:

I have an incorrect schema -->   a missing field :

and when I add documents (UpdateResponse ur = solrServer.add(docs)), I have
not been able to catch the exception in SolrJ, and the UpdateResponse cannot
report the failure.

I use solr-core3.6, solr-solrj3.6 and solr.war4.0

Which SolrServer implementation are you using?  If you are using 
ConcurrentUpdateSolrServer (or its httpclient 3.x predecessor, 
StreamingUpdateSolrServer), your program will never be able to detect when an 
error occurs.

That is not strictly true. There is an error handling method you can override - 
currently it simply logs an exception. It's not an ideal solution, since you won't 
have fine-grained doc -> error detection, but you can in fact detect that an error 
has occurred.


That's definitely an option, but I haven't seen any documentation 
telling people how to do it, and I haven't reasoned out how to do it.  I 
think it would be better if error handling were built in, even if you 
can't connect it to a specific request and it's turned off by default.


I have already submitted a patch on SOLR-3284 that causes 
ConcurrentUpdateSolrServer to throw an exception on a subsequent request 
if a prior update failed.  If you are only doing one update request, you 
would have to follow it with a commit in order to receive the error.  My 
patch makes this new behavior the default, but it's a one-line change if 
that's considered too disruptive.


Thanks,
Shawn



Re: problem in replication

2012-05-18 Thread Tomás Fernández Löbbe
With replication every 15 minutes you could still do some autowarming. But
if autowarming were the problem, you should see only the first couple of
queries run slow; after that it should go back to normal. Is this what you are
seeing?

Are your queries very complex? Do you facet on many fields? Are those
single-valued or multivalued?

On Fri, May 18, 2012 at 8:43 AM, shinkanze wrote:

> hi All ,
>
>
> To provide realtime data we are delta indexing every 15 minutes and then
> replicating it to the slave .
>
> *Auto warmup count is 0. Dismax queries are getting really slow (30 to
> 90 seconds), and if I stop the delta replication, the dismax queries get
> fast again. If I run queries on a standalone server they are fast.*
>
> What may be the issue?
>
> Need help on a tight timeline.
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/problem-in-replication-tp3984654.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


RE: SolrCloud deduplication

2012-05-18 Thread Markus Jelsma
Hi,

Interesting! I'm watching the issues and will test as soon as they are 
committed.

Thanks!

 
 
-Original message-
> From:Mark Miller 
> Sent: Fri 18-May-2012 16:05
> To: solr-user@lucene.apache.org; Markus Jelsma 
> Subject: Re: SolrCloud deduplication
> 
> Hey Markus -
> 
> When I ran into a similar issue with another update proc, I created 
> https://issues.apache.org/jira/browse/SOLR-3215 so that I could order things 
> to avoid this. I have not committed this yet though, in favor of waiting for 
> https://issues.apache.org/jira/browse/SOLR-2822
> 
> Go vote? :)
> 
> On May 18, 2012, at 7:49 AM, Markus Jelsma wrote:
> 
> > Hi,
> > 
> > Deduplication on SolrCloud through the SignatureUpdateRequestProcessor is 
> > no longer functional. The problem is that documents are passed multiple 
> > times through the URP, and the digest field is added as if it were a 
> > multi-valued field. If the field is not multi-valued you'll get the 
> > typical error. Changing the order of URPs in the chain does not solve 
> > the problem.
> > 
> > Any hints on how to resolve the issue? Is this a problem in the 
> > SignatureUpdateRequestProcessor and does it need to be updated to work with 
> > SolrCloud? 
> > 
> > Thanks,
> > Markus
> 
> - Mark Miller
> lucidimagination.com
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
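For anyone reproducing the setup under discussion, a typical deduplication chain in solrconfig.xml looks roughly like this, sketched from the Solr wiki's Deduplication page (the signature field name "digest" and the source fields are assumptions based on this thread, not the poster's actual config):

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">digest</str>
    <bool name="overwriteDupes">false</bool>
    <str name="fields">url,content</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```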


Re: copyField

2012-05-18 Thread Yury Kats
On 5/18/2012 9:54 AM, Tolga wrote:
> Hi,
> 
> I've put the line  indexed="true"/> in my schema.xml and restarted Solr, crawled my 
> website, and indexed (I've also committed but do I really have to 
> commit?). But I still have to search with content:mykeyword at the admin 
> interface. What do I have to do so that I can search only with mykeyword?

Do you have the default field defined?
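For completeness, a sketch of the two schema.xml pieces involved (field names here are assumptions, not taken from Tolga's schema): copy the crawled content into a catch-all field, and declare that field as the default for unqualified queries, per the SchemaXml wiki page linked below.

```xml
<copyField source="content" dest="text"/>
<defaultSearchField>text</defaultSearchField>
```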



Re: StripHTML and HTMLStripCharFilterFactory

2012-05-18 Thread Jack Krupansky
It is simply a question of whether or not you wish to have the raw HTML stored 
in the field so that it can be returned to the application for display 
purposes. If you simply want the HTML to go away as soon as possible, use 
"stripHTML", but then there is no need to use the factory on the field in the 
Solr schema. But if you do want to preserve the HTML for later output, don't 
"stripHTML"; do use the factory in the Solr schema instead, since the index 
should not contain the HTML even though the "stored" field value will retain 
the full HTML.
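A sketch of the second case described above (type and field names are illustrative, not from the poster's schema): the char filter strips tags from the tokens that get indexed, while the stored value keeps the raw HTML for display.

```xml
<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="body" type="text_html" indexed="true" stored="true"/>
```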

-- Jack Krupansky

From: Sergio Martín Cantero 
Sent: Friday, May 18, 2012 10:53 AM
To: solr-user@lucene.apache.org 
Subject: StripHTML and HTMLStripCharFilterFactory

Hello.
Could you tell me the difference between these two?

1) Having a DIH with a field in data-import-config.xml like this:


2) Having the schema.xml with a field like this:








I assume when I call the DIH, it first removes the HTML, and then, when 
indexing, the HTML should be removed again, but the HTML was already removed by 
stripHTML in data-import-config.
So, does it make sense to declare a field as stripHTML=true when that field 
will be stored in a field with an HTMLStripCharFilterFactory?

Thanks for your help.


  
Sergio Martín Cantero
Office (ES) +34 91 733 73 97
playence Spain SL
sergio.mar...@playence.com
Calle Vicente Gaceo 19
28029 Madrid - España


Re: Indexing & Searching MySQL table with Hindi and English data

2012-05-18 Thread Jack Krupansky
Check the analyzers for the field types containing Hindi text to be sure 
that they are not using a character mapping or "folding" filter that might 
mangle the Hindi characters. Post the field type, say for the "title" field.


Also, try manually (using curl or the post jar) adding a single document 
that has Hindi data and see if that works.


-- Jack Krupansky

-Original Message- 
From: KP Sanjailal

Sent: Thursday, May 17, 2012 5:55 AM
To: solr-user@lucene.apache.org
Subject: Indexing & Searching MySQL table with Hindi and English data

Hi,

I tried to setup indexing of MySQL tables in Apache Solr 3.6.

Everything works fine, but text in Hindi script (only some 10% of total
records) is not getting indexed properly.

A search with a keyword in Hindi retrieves an empty result set. Also, a
retrieved Hindi record displays junk characters.

The database tables contain bibliographical details of books such as
title, author, publisher, ISBN, publishing place, series etc., and out of
the total records about 10% contain text in Hindi in the title,
author, and publisher fields.

Example:

*Search Results from MySQL using PHP*

  1. *Title:* सौर ऊर्जा Saur oorja
     *Author(s):* विनोद कुमार मिश्र MISHRA (VK)
     *Material:* Books

*Search Results from Apache Solr (searched using keyword in English)*

  1. *Title:* सौर ऊर्जा Saur oorja
     *Author(s):* विनोद कुमार मिश्र MISHRA (VK)
     *Material:* Books


How do I go about solving this language problem.

Thanks in advance.

K. P. Sanjailal
--



StripHTML and HTMLStripCharFilterFactory

2012-05-18 Thread Sergio Martín Cantero

  
  
Hello.
Could you tell me the difference between these two?

1) Having a DIH with a field in data-import-config.xml like this:


2) Having the schema.xml with a field like this:
    
        
            
        
    

    

I assume when I call the DIH, it first removes the HTML, and
then, when indexing, the HTML should be removed again, but the HTML
was already removed by stripHTML in data-import-config.
So, does it make sense to declare a field as stripHTML=true when
that field will be stored in a field with an
HTMLStripCharFilterFactory?

Thanks for your help.


  
  

  
 


  

  

Sergio Martín Cantero
Office (ES) +34 91 733 73 97
playence Spain SL
sergio.mar...@playence.com
Calle Vicente Gaceo 19
28029 Madrid - España



Re: solrj with incorrect schema

2012-05-18 Thread Mark Miller

On May 18, 2012, at 10:26 AM, Shawn Heisey wrote:

> On 5/18/2012 1:42 AM, Jamel ESSOUSSI wrote:
>> I have an incorrect schema -->  a missing field :
>> 
>> and when I add documents (UpdateResponse ur = solrServer.add(docs)), I have
>> not been able to catch the exception in SolrJ, and the UpdateResponse cannot
>> report the failure.
>> 
>> I use solr-core3.6, solr-solrj3.6 and solr.war4.0
> 
> Which SolrServer implementation are you using?  If you are using 
> ConcurrentUpdateSolrServer (or its httpclient 3.x predecessor, 
> StreamingUpdateSolrServer), your program will never be able to detect when an 
> error occurs.  

That is not strictly true. There is an error handling method you can override - 
currently it simply logs an exception. It's not an ideal solution, since you won't 
have fine-grained doc -> error detection, but you can in fact detect that an 
error has occurred.

> You must use HttpSolrServer if exception handling is a requirement.
> 
> If you are already using HttpSolrServer or CommonsHttpSolrServer, then I fear 
> your problem is beyond my experience.
> 
> Thanks,
> Shawn
> 

- Mark Miller
lucidimagination.com













Re: copyField

2012-05-18 Thread Tolga
I'll make sure to do that. Thanks

Sent from my myPhone

On 18 May 2012, at 17:40, "Jack Krupansky" wrote:

> Did you also delete all existing documents from the index? Maybe your crawl 
> did not re-index documents that were already in the index or that hadn't 
> changed since the last crawl, leaving the old index data as it was before the 
> change.
> 
> -- Jack Krupansky
> 
> -Original Message- From: Tolga
> Sent: Friday, May 18, 2012 9:54 AM
> To: solr-user@lucene.apache.org
> Subject: copyField
> 
> Hi,
> 
> I've put the line  indexed="true"/> in my schema.xml and restarted Solr, crawled my
> website, and indexed (I've also committed but do I really have to
> commit?). But I still have to search with content:mykeyword at the admin
> interface. What do I have to do so that I can search only with mykeyword?
> 
> Regards, 


Re: copyField

2012-05-18 Thread Jack Krupansky
Did you also delete all existing documents from the index? Maybe your crawl 
did not re-index documents that were already in the index or that hadn't 
changed since the last crawl, leaving the old index data as it was before 
the change.


-- Jack Krupansky

-Original Message- 
From: Tolga

Sent: Friday, May 18, 2012 9:54 AM
To: solr-user@lucene.apache.org
Subject: copyField

Hi,

I've put the line  in my schema.xml and restarted Solr, crawled my
website, and indexed (I've also committed but do I really have to
commit?). But I still have to search with content:mykeyword at the admin
interface. What do I have to do so that I can search only with mykeyword?

Regards, 



Re: Problem with query negation

2012-05-18 Thread Ramprakash Ramamoorthy
On Fri, May 18, 2012 at 7:26 PM, Ahmet Arslan  wrote:

> > I don't think the request handler should be a problem. I have
> > just used the *q* parameter as follows.
> >
> > String q = params.get(CommonParams.Q);
> > IndexSchema schema = req.getSchema();
> > Query query = new QueryParsing().parseQuery(q, schema);
> >
> > Hope there shouldn't be a problem with the above!
>
> Solr converts top level negative query (-field:something) into q=+*:*
> -field:something
>
> It seems that you are missing that part.
>
> org.apache.solr.search.QueryUtils
>
> /** Fixes a negative query by adding a MatchAllDocs query clause.
>   * The query passed in *must* be a negative query.
>   */
> public static Query fixNegativeQuery(Query q) {
>   BooleanQuery newBq = (BooleanQuery)q.clone();
>   newBq.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
>   return newBq;
> }
>

Oh that's great :)

Thank you very much! You made my day :)

-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
Project Trainee,
Zoho Corporation.
+91 9626975420


Re: Search plain text

2012-05-18 Thread Jack Krupansky

The dismax query parser should be good enough.

-- Jack Krupansky

-Original Message- 
From: Tolga

Sent: Friday, May 18, 2012 8:46 AM
To: solr-user@lucene.apache.org
Subject: Re: Search plain text

My website is http://liseyazokulu.sabanciuniv.edu/ -- it has the word
barınma in it, and I want to be able to search for that by just typing
"barınma" in the admin interface.

On 5/18/12 3:40 PM, Jack Krupansky wrote:
Could you give us some examples of the kinds of search you want to do, 
besides keywords and quoted phrases?


The dismax query parser may be good enough.

-- Jack Krupansky

-Original Message- From: Tolga
Sent: Friday, May 18, 2012 6:27 AM
To: solr-user@lucene.apache.org
Subject: Search plain text

Hi,

I have 96 documents added to index, and I would like to be able to
search in them in plain text, without using complex search queries. How
can I do that?

Regards, 




Re: Use DIH with more than one entity at the same time

2012-05-18 Thread Sergio Martín Cantero

I see.
What I need is not multiple threads for one entity but multiple entities 
at the same time.


What I have done is define a separate DIH handler for each of the entities in 
solrconfig, although they are using the same data-import-config.xml.

Something like:

<requestHandler name="/dataimportUsers"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
  </lst>
</requestHandler>

<requestHandler name="/dataimportProducts"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-import-config.xml</str>
  </lst>
</requestHandler>


Then I can run each entity at the same time with:
http://localhost:8080/solr/dataimportUsers?command=full-import&entity=users
http://localhost:8080/solr/dataimportProducts?command=full-import&entity=products

with users and products being entities defined in the same data-import-config.xml.

This way, I don't need to wait to run products until users has finished.
This allows me to call full-import for users, say, every 15 min and for 
products every 10 min, without needing to wait until one has finished. 
Both can overlap.


Any drawback to this approach?

Thanks!!

Sergio

On 18/05/12 16:21, Dyer, James wrote:


"threads" lets you run a single entity with multiple threads, so it's 
probably not what you wanted. What we've done here is partition the 
source data, and then we have multiple handlers running at the same 
time, each processing its own partition. So we multi-thread the import 
without using the "threads" parameter.


Even if this sounds like something useful, I recommend against using 
it. "threads" has tons of bugs, although some fixes were made for Solr 
3.6. For Solr 4.0 this feature is removed.


*James Dyer*

E-Commerce Systems

Ingram Content Group

(615) 213-4311

*From:*Sergio Martín Cantero [mailto:sergio.mar...@playence.com]
*Sent:* Friday, May 18, 2012 6:23 AM
*To:* solr-user@lucene.apache.org
*Cc:* Dyer, James
*Subject:* Re: Use DIH with more than one entity at the same time

What the wiki indicates actually works, although it's not what I 
wanted. I have tried it and it works fine.


I have also tried Jack's approach and it also works fine (and is what I 
was looking for :-)


Still, I have one more question. You wrote: "This is a 1.4.1 
installation, back when there was no 'threads' option in DIH." I'm 
using Solr 3.5. What would the use of threads change? How could I take 
advantage of it, instead of declaring various DIHs in solrconfig.xml?


Thanks a lot!


On 17/05/12 18:33, Dyer, James wrote:

The wiki here indicates that you can specify "entity" more than once on the 
request and it will run multiple entities at the same time, in the same handler:   
http://wiki.apache.org/solr/DataImportHandler#Commands
  
But I can't say for sure that this actually works! Having been in the DIH code, I would think such a feature is buggy at best, if it works at all. But if you try it, let us know how it works for you. Also, if anyone else out there is using multiple "entity" parameters to get entities running in parallel, I'd be interested in hearing about it.
  
But the approach taken in the link Jack cites below does work. It's a pain to set up though.
  
James Dyer

E-Commerce Systems
Ingram Content Group
(615) 213-4311
  
From: Jack Krupansky [mailto:j...@basetechnology.com]

Sent: Thursday, May 17, 2012 10:21 AM
To:solr-user@lucene.apache.org  
Subject: Re: Use DIH with more than one entity at the same time
  
Okay, the answer is “Yes, sort of, but...”
  
“One annoyance is because of how DIH is designed, you need a separate handler set up in solrconfig.xml for each DIH you plan to run.  So you have to plan in advance how many DIH instances you want to run, which config files they'll use, etc.”
  
See:

http://lucene.472066.n3.nabble.com/Multiple-dataimport-processes-to-same-core-td3645525.html
  
-- Jack Krupansky
  
From: Sergio Martín Cantero

Sent: Thursday, May 17, 2012 11:07 AM
To:solr-user@lucene.apache.org  

Cc: Jack Krupansky
Subject: Re: Use DIH with more than one entity at the same time
  
Thanks Jack, but that's not what I want.
  
I don't want multiple entities in one invocation, but two simultaneous invocations of the DIH with different entities.
  
Thanks.

Sergio Martín Cantero
  
Office (ES) +34 91 733 73 97
  
playence Spain SL
  
sergio.mar...@playence.com  
  
Calle Vicente Gaceo 19
  
28029 Madrid - España
  
  
  
  
On 17/05/12 17:04, Jack Krupansky wrote:

Yes. From the doc:
  
"Multiple 'entity' parameters can be passed on to run multiple entities at once. If nothing is passed, all entities are executed."
  
See:

http://wiki.apache.org/solr/DataImportHandler
  
But that is one invocation of DIH, not two separate updates as you tried.
  
-- Jack Krupansky
  
--

Re: Indexing & Searching MySQL table with Hindi and English data

2012-05-18 Thread Gora Mohanty
On 18 May 2012 18:43, KP Sanjailal  wrote:
> Hi,
>
> Even after setting the URIEncoding="UTF-8" in the tomcat /conf/server.xml,
> indexing and search of Hindi characters doesn't work.  The output furnished
> below:
[...]

How are you indexing your data into Solr?
Where does your MySQL database reside, and
what is the encoding used for the tables?

For the record, I can confirm that we have used
Hindi in Solr without issues, so in your case,
there must be some place where a character
encoding/decoding error is happening.
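For reference, the Tomcat setting mentioned in the quoted message goes on the Connector element in conf/server.xml (the port and timeout values here are illustrative):

```xml
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           URIEncoding="UTF-8"/>
```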

Regards,
Gora


Re: Merging two DocSets in solr

2012-05-18 Thread Ramprakash Ramamoorthy
On Sun, May 13, 2012 at 4:45 PM, Dmitry Kan  wrote:

> Are you operating inside the SOLR source code or on the (solrj) client
> side?
>
> SOLR source code!

> On Fri, May 11, 2012 at 12:46 PM, Ramprakash Ramamoorthy <
> youngestachie...@gmail.com> wrote:
>
> > Dear all,
> >
> >  I get two different DocSets from two different searchers. I need
> > to merge them into one and get the facet counts from the merged
> > docSets. How do I do it? Any pointers would be appreciated.
> >
> > --
> > With Thanks and Regards,
> > Ramprakash Ramamoorthy,
> > Project Trainee,
> > Zoho Corporation.
> > +91 9626975420
> >
>
>
>
> --
> Regards,
>
> Dmitry Kan
>



-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
Project Trainee,
Zoho Corporation.
+91 9626975420


Re: solrj with incorrect schema

2012-05-18 Thread Shawn Heisey

On 5/18/2012 1:42 AM, Jamel ESSOUSSI wrote:

I have an incorrect schema -->  a missing field :

and when I add documents (UpdateResponse ur = solrServer.add(docs)), I have
not been able to catch the exception in SolrJ, and the UpdateResponse cannot
report the failure.

I use solr-core3.6, solr-solrj3.6 and solr.war4.0


Which SolrServer implementation are you using?  If you are using 
ConcurrentUpdateSolrServer (or its httpclient 3.x predecessor, 
StreamingUpdateSolrServer), your program will never be able to detect 
when an error occurs.  You must use HttpSolrServer if exception handling 
is a requirement.


If you are already using HttpSolrServer or CommonsHttpSolrServer, then I 
fear your problem is beyond my experience.


Thanks,
Shawn



Re: CloudSolrServer not working with standalone Zookeeper

2012-05-18 Thread Mark Miller
It seems something is stopping the connection from occurring. Tests are constantly 
running and doing this using an embedded zk server, and I know more than a few 
people using an external zk setup. I'd have to guess something in your env or 
URL is causing this?


On May 16, 2012, at 3:11 PM, Daniel Brügge wrote:

> OK, it's also not working with an internal started Zookeeper.
> 
> On Wed, May 16, 2012 at 8:29 PM, Daniel Brügge <
> daniel.brue...@googlemail.com> wrote:
> 
>> Hi,
>> 
>> I am just playing around with SolrCloud and have read in articles like
>> 
>> http://www.lucidimagination.com/blog/2012/03/05/scaling-solr-indexing-with-solrcloud-hadoop-and-behemoth/that
>>  it
>> is sufficient to create the connection to the Zookeeper instance and not
>> to the Solr instance.
>> When I try to connect to my standalone  Zookeeper instance (not started
>> with a Solr instance and "-DzkRun") I am getting this error:
>> 
>> Caused by: java.util.concurrent.TimeoutException: Could not connect to
>>> ZooKeeper
>> 
>> 
>> I am also getting this error when I try to connect directly to one of the
>> Solr instances.
>> 
>> My code looks like this:
>> 
>>solr = new CloudSolrServer("myzkhost:2181");
>>((CloudSolrServer) solr).setDefaultCollection("collection1");
>> 
>> I am working with the latest Solr trunk version (
>> https://builds.apache.org/view/S-Z/view/Solr/job/Solr-trunk/1855/)
>> 
>> Do I need to start the zookeeper in Solr to keep this working?
>> 
>> Thanks & regards
>> 
>> Daniel
>> 

- Mark Miller
lucidimagination.com













Re: Adding config to SolrCloud without creating any shards/slices

2012-05-18 Thread Mark Miller

On May 18, 2012, at 3:06 AM, Per Steffensen wrote:

> First of all, sorry about the subject of this discussion. It should have been 
> something like "Adding config to SolrCloud without starting a Solr server"
> 
> Mark Miller skrev:
>> k
>> On May 16, 2012, at 5:35 AM, Per Steffensen wrote:
>> 
>>  
>>> Hi
>>> 
>>> We want to create a Solr config in ZK during installation of our product, 
>>> but we dont want to create any shards in that phase. We will create shards 
>>> from our application when it starts up and also automatically maintain the 
>>> set of shards from our application (which uses SolrCloud). The only way we 
>>> know to create a Solr config in ZK is to spin up a  Solr with  system 
>>> properties zkHost, bootstrap_confdir and collection.configName. Is there 
>>> another, more API-ish, way of creating a Solr config in ZK?
>>> 
>>> Regards, Per Steffensen
>>>
>> 
>> I've started some work on this, but I have not finished.
>> 
>> There is a main method in ZkController that has some initial code. Currently 
>> it just lets you upload a specifically named config set directory - I would 
>> also like to add the same multi core config set upload option we have on 
>> startup - where it reads solr.xml, finds all the config dirs and uploads 
>> them, and links each collection to a config set named after it.
>>  
> Yeah ok, I just want the config created - no shards/slices/collections.

That's all that is created.

>> Technically, you could use any tool to set this up - there are a variety of 
>> options in the zk world - you just have to place the config files under the 
>> right node.
> I would really want to do it through Solr. This is the correct way, I think. 
> So that, when you change your "strategy" e.g. location or format of configs 
> in ZK, I will automatically inherit that.

We have to commit to some level of back compat support with our ZK layout 
regardless. We expect to expose it.

>> There is one other tricky part though - the collection has to be set to the 
>> right config set name. This is specified on the collection node in 
>> ZooKeeper. When creating a new collection, you can specify this as a param. 
>> If none is set and there is only one config set, that one config set is 
>> used. However, some link must be made, and it is not done automatically with 
>> your initial collections in solr.xml unless there is only one config set.
>>  
> I know about that, and will use Solr to create collections. I just want the 
> config established in ZK before that, and not create the config "during the 
> process of creating a collection".

Yeah, but doing what you want is tricky because of that point.

>> So now I'm thinking perhaps we should default the config set name to the 
>> collection name. Then if you simply use the collection name initially when 
>> you upload the set, no new linking is needed. If you don't like that, you 
>> can explicitly override what config set to use. Convention would be to name 
>> your config sets after your collection name, but extra work would allow you 
>> to do whatever you want.
>>  
> I want several collections to use the same config, so I would have to do that 
> extra work.

I'm not sure I have a great solution yet then. How are you creating your 
initial collections?

If you are creating them on the fly with solrj (a collections api coming soon 
by the way), then you can simply give the collection set name to use when you 
do.

If you are creating them in solr.xml so that they exist on startup, and some 
have to share config sets, I think we need to add something else. Perhaps a 
hint property you could add to each core in solr.xml that caused a link to be 
made when the core is first started? Since the config sets will be uploaded 
first, we need some way of indicating to each collection which set to end up 
using.

>> You can find an example of the ZkContoller main method being used in 
>> solr/cloud-dev scripts. The one caveat is that we pass an extra param to 
>> solrhome and briefly run a ZkServer within the ZkController#main method 
>> since we don't have an external ensemble. Normally this would not make sense 
>> and you would want to leave that out. I need to clean this all up (the 
>> ZkController params) and document it on the wiki as soon as I make these 
>> couple tweaks though.
>>  
> Ok

I've actually made the changes that I said I would. So now, this would be 
pretty easy if each collection had its own config set. Let's work out how to 
make your case a little easier as well.

> 
> Thanks, Mark
>> - Mark Miller
>> lucidimagination.com
>>  
> 
> Regards, Per Steffensen
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>  
> 

- Mark Miller
lucidimagination.com













Re: Question about wildcards

2012-05-18 Thread Ahmet Arslan


> I have a field that was indexed with the string
> ".2231-7". When i
> search using '*' or '?' like this "*2231-7" the query
> don't returns
> results. When i remove "-7" substring and search agin using
> "*2231" the
> query returns. Finally when i search using 
> ".2231-7" the query returns
> too.

Maybe the standard tokenizer is splitting .2231-7 into multiple tokens?
You can check that on the admin/analysis page.

Maybe -7 is treated as a negative clause? You can check that with &debugQuery=on
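
If the goal is for a wildcard query like *2231-7 to match, one option is a field type that keeps the whole value as a single token rather than letting the standard tokenizer split it. A minimal schema.xml sketch (the fieldType name here is an assumption):

```xml
<!-- Keep values such as ".2231-7" as one lowercased token, so a
     wildcard query like *2231-7 can match the full value -->
<fieldType name="keyword_lc" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that wildcard terms are not analyzed at query time, so the query term must be lowercased by the client, and documents need re-indexing after the schema change.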



Re: SolrCloud deduplication

2012-05-18 Thread Mark Miller
Hey Markus -

When I ran into a similar issue with another update proc, I created 
https://issues.apache.org/jira/browse/SOLR-3215 so that I could order things to 
avoid this. I have not committed this yet though, in favor of waiting for 
https://issues.apache.org/jira/browse/SOLR-2822

Go vote? :)

On May 18, 2012, at 7:49 AM, Markus Jelsma wrote:

> Hi,
> 
> Deduplication on SolrCloud through the SignatureUpdateRequestProcessor is not 
> functional anymore. The problem is that documents are passed multiple times 
> through the URP and the digest field is added as if it were a multi-valued 
> field. 
> If the field is not multi-valued you'll get the typical error. Changing the 
> order of URPs in the chain does not solve the problem.
> 
> Any hints on how to resolve the issue? Is this a problem in the 
> SignatureUpdateRequestProcessor and does it need to be updated to work with 
> SolrCloud? 
> 
> Thanks,
> Markus

- Mark Miller
lucidimagination.com













Re: Distributed search between solrclouds?

2012-05-18 Thread Mark Miller
Yeah, you can still override the shards param and search anywhere AFAIK. I have 
not tried it recently, but it should work.

On May 18, 2012, at 7:57 AM, Darren Govoni wrote:

> The thought here is to distribute a search between two different
> solrcloud clusters and get ordered ranked results between them.
> Is it possible?
> 
> On Tue, 2012-05-15 at 18:47 -0400, Darren Govoni wrote:
>> Hi,
>>  Would distributed search (the old way where you provide the solr host
>> IP's etc.) still work between different solrclouds?
>> 
>> thanks,
>> Darren
>> 
> 
> 

- Mark Miller
lucidimagination.com













Re: Problem with query negation

2012-05-18 Thread Ahmet Arslan
> I don't think the request handler should be a problem. I have just used
> the *q* parameter as follows.
> 
> String q = params.get(CommonParams.Q);
> IndexSchema schema = req.getSchema();
> Query query = QueryParsing.parseQuery(q, schema);
> 
> Hope there shouldn't be a problem with the above!

Solr converts a top-level negative query (-field:something) into q=+*:* 
-field:something

It seems that you are missing that part.

org.apache.solr.search.QueryUtils

  /** Fixes a negative query by adding a MatchAllDocs query clause.
   *  The query passed in *must* be a negative query.
   */
  public static Query fixNegativeQuery(Query q) {
    BooleanQuery newBq = (BooleanQuery) q.clone();
    newBq.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
    return newBq;
  }


copyField

2012-05-18 Thread Tolga

Hi,

I've put the line indexed="true"/> in my schema.xml and restarted Solr, 
crawled my website, and indexed (I've also committed, but do I really have 
to commit?). But I still have to search with content:mykeyword in the admin 
interface. What do I have to do so that I can search with just mykeyword?


Regards,
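
For reference, the usual fix is a catch-all field plus a defaultSearchField, so that a bare q=mykeyword searches the crawled text. A hedged schema.xml sketch (field and type names are assumptions based on the standard Nutch schema):

```xml
<!-- Copy the crawled body text into a catch-all field and make it the
     default for bare-keyword queries (names/types are assumptions) -->
<field name="content" type="text_general" indexed="true" stored="true"/>
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="content" dest="text"/>

<defaultSearchField>text</defaultSearchField>
```

copyField is applied at index time, so documents do need to be re-indexed and committed after the change.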


Re: Problem with query negation

2012-05-18 Thread Ramprakash Ramamoorthy
On Fri, May 18, 2012 at 6:20 PM, Ahmet Arslan  wrote:

> > I am using the standard LuceneQParserPlugin and I have a
> > custom request
> > handler. I use solr 3.5
>
> I would test the same query with the standard request handler. Maybe
> something is wrong in your custom request handler?
>
I don't think the request handler should be a problem. I have just used the
*q* parameter as follows.

String q = params.get(CommonParams.Q);
IndexSchema schema = req.getSchema();
Query query = QueryParsing.parseQuery(q, schema);

Hope there shouldn't be a problem with the above!


> &debugQuery=on would help too.
>



-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
Project Trainee,
Zoho Corporation.
+91 9626975420


Re: Question about cache

2012-05-18 Thread Anderson vasconcelos
Hi Kuli

I was just raising a concern. Thanks for the explanation.

Regards

Anderson

2012/5/11 Shawn Heisey 

> On 5/11/2012 9:30 AM, Anderson vasconcelos wrote:
>
>> HI  Kuli
>>
>> The free -m command gives me
>>                   total       used       free     shared    buffers     cached
>> Mem:               9991       9934         57          0         75       5759
>> -/+ buffers/cache:            4099       5892
>> Swap:              8189       3395       4793
>>
>> You can see that has only 57m free and 5GB cached.
>>
>> In top command, the glassfish process used 79,7% of memory:
>>
>>   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>  4336 root  21   0 29.7g 7.8g 4.0g S  0.3 79.7  5349:14  java
>>
>>
>> If i increase the memory of server for more 2GB, the SO will be use this
>> additional 2GB in cache? I need to increse the memory size?
>>
>
> Are you having a problem you need to track down, or are you just raising a
> concern because your memory usage is not what you expected?
>
> It is 100% normal for a Linux system to show only a few megabytes of
> memory free.  To make things run faster, the OS caches disk data using
> memory that is not directly allocated to programs or the OS itself.  If a
> program requests memory, the OS will allocate it immediately, it simply
> forgets the least used part of the cache.
>
> Windows does this too, but Microsoft decided that novice users would freak
> out if the task manager were to give users the true picture of memory
> usage, so they exclude disk cache when calculating free memory.  It's not
> really a lie, just not the full true picture.
>
> A recent version of Solr (3.5, if I remember right) made a major change in
> the way that the index files are accessed.  The way things are done now is
> almost always faster, but it makes the memory usage in the top command
> completely useless.  The VIRT memory size includes all of your index files,
> plus all the memory that the java process is capable of allocating, plus a
> little that i can't quite account for.  The RES size is also bigger than
> expected, and I'm not sure why.
>
> Based on the numbers above, I am guessing that your indexes take up
> 15-20GB of disk space.  For best performance, you would want a machine with
> at least 24GB of RAM so that your entire index can fit into the OS disk
> cache.  The 10GB you have (which leaves the 5.8 GB for disk cache as you
> have seen) may be good enough to cache the frequently accessed portions of
> your index, so your performance might be just fine.
>
> Thanks,
> Shawn
>
>


Re: Indexing & Searching MySQL table with Hindi and English data

2012-05-18 Thread KP Sanjailal
Hi,

Even after setting the URIEncoding="UTF-8" in the tomcat /conf/server.xml,
indexing and search of Hindi characters doesn't work.  The output furnished
below:


 
[Solr XML response; the markup was stripped in transit. Recoverable values:]

responseHeader: status=0, QTime=11
params: on, 0, "acc_no: H-100", 10, 2.2

doc:
  H-100
  विनोद कुमार मिश्र MISHRA (VK)
  26913
  BYC,EA1-Ha P04
  Hindi Books
  1
  RS
  FOSTIS-II Floor
  26913
  Books
  139
  150.0
  150.00,USD
  2004
  नई दिल्ली New Delhi
  ग्रंथ अकादमी Granth Akademi
  Hindi
  विनोद कुमार मिश्र MISHRA (VK)
  On Shelf
  सौर ऊर्जा Saur oorja
  Solar Energy, Astrophysics, Astrophysics, Solar Energy, Energy, Astrophysics
  East Wing
Please help me sort out this problem.


Sanjailal KP
--
On Fri, May 18, 2012 at 4:05 PM, KP Sanjailal  wrote:

> Thank you for your guidance.  Actually, I am using Apache Solr with jetty
> 5.1.  I will try to setup solr using tomcat incorporating the configuration
> suggested below by you.
>
> Thank you
>
> Sanjailal K P
> --
>
>  On Thu, May 17, 2012 at 7:24 PM, Ahmet Arslan  wrote:
>
>> > A search with keyword in Hindi retrieve emptly result
>> > set.  Also a
>> > retrieved hindi record displays junk characters.
>>
>> Could it be URIEncoding setting of your servlet container?
>> http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
>>
>
>


Re: Problem with query negation

2012-05-18 Thread Ahmet Arslan
> I am using the standard LuceneQParserPlugin and I have a
> custom request
> handler. I use solr 3.5

I would test the same query with the standard request handler. Maybe 
something is wrong in your custom request handler? 

&debugQuery=on would help too.


Re: Search plain text

2012-05-18 Thread Tolga
My website is http://liseyazokulu.sabanciuniv.edu/. It has the word 
barınma in it, and I want to be able to search for it by just typing 
"barınma" in the admin interface.


On 5/18/12 3:40 PM, Jack Krupansky wrote:
Could you give us some examples of the kinds of searches you want to do, 
besides keywords and quoted phrases?


The dismax query parser may be good enough.

-- Jack Krupansky

-Original Message- From: Tolga
Sent: Friday, May 18, 2012 6:27 AM
To: solr-user@lucene.apache.org
Subject: Search plain text

Hi,

I have 96 documents added to index, and I would like to be able to
search in them in plain text, without using complex search queries. How
can I do that?

Regards,


Re: Problem with query negation

2012-05-18 Thread Ramprakash Ramamoorthy
On Fri, May 18, 2012 at 6:03 PM, Ahmet Arslan  wrote:

> > > Why don't you just use /solr/select/?q=-HOSTID:302
> > >
> >
> > Tried that right at the start, but it never worked :(
>
> q=-HOSTID:302 and q=+*:* -HOSTID:302 should return the same result set.
> Which solr version and query parser are you using?
>

I am using the standard LuceneQParserPlugin and I have a custom request
handler. I use solr 3.5

-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
Engineer Trainee,
Zoho Corporation.
+91 9626975420


Re: Unknown field

2012-05-18 Thread Jack Krupansky

You could enable the "*" dynamic field which accepts all field names.

-- Jack Krupansky

-Original Message- 
From: Tolga 
Sent: Friday, May 18, 2012 2:54 AM 
To: solr-user@lucene.apache.org 
Subject: Unknown field 


Hi,

Is there a way to know what fields to add to schema.xml prior to crawling 
with Nutch, rather than crawling over and over again and fixing the fields 
one by one?


Regards,
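
The catch-all dynamic field mentioned above might look like this in schema.xml (the type and attribute choices here are assumptions):

```xml
<!-- Accept any field name Nutch sends instead of failing with
     "unknown field" (type/attributes are assumptions) -->
<dynamicField name="*" type="text_general" indexed="true" stored="true" multiValued="true"/>
```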


Re: Search plain text

2012-05-18 Thread Jack Krupansky
Could you give us some examples of the kinds of searches you want to do, 
besides keywords and quoted phrases?


The dismax query parser may be good enough.

-- Jack Krupansky

-Original Message- 
From: Tolga

Sent: Friday, May 18, 2012 6:27 AM
To: solr-user@lucene.apache.org
Subject: Search plain text

Hi,

I have 96 documents added to index, and I would like to be able to
search in them in plain text, without using complex search queries. How
can I do that?

Regards, 



Re: Problem with query negation

2012-05-18 Thread Ahmet Arslan
> > Why don't you just use /solr/select/?q=-HOSTID:302
> >
> 
> Tried that right at the start, but it never worked :(

q=-HOSTID:302 and q=+*:* -HOSTID:302 should return the same result set. 
Which solr version and query parser are you using?


Re: Problem with query negation

2012-05-18 Thread Ramprakash Ramamoorthy
On Fri, May 18, 2012 at 5:03 PM, Ahmet Arslan  wrote:

> > I am trying the following query and get only zero results (I
> > am supposed to
> > get 10 results according to my dataset)
> >
> > *http://mymachine:8983/solr/select/?q=-(HOSTID:302)*
> >
> > I also tried the below query and got zero results yet
> > again.
> >
> > *http://mymachine:8983/solr/select/?q=NOT(HOSTID:302)*
> >
> > However, I get 10 results(expected) when I put the query
> > this way,
> >
> > *http://mymachine:8983/solr/select/?q=-(HOSTID:302)AND(*:*)*
> >
> > Why is this strange thing happening? Is it a bug in solr or
> > am I missing
> > something?
>
> Why don't you just use /solr/select/?q=-HOSTID:302
>

Tried that right at the start, but it never worked :(


-- 
With Thanks and Regards,
Ramprakash Ramamoorthy,
Project Trainee,
Zoho Corporation.
+91 9626975420


Re: Distributed search between solrclouds?

2012-05-18 Thread Darren Govoni
The thought here is to distribute a search between two different
solrcloud clusters and get ordered ranked results between them.
Is it possible?

On Tue, 2012-05-15 at 18:47 -0400, Darren Govoni wrote:
> Hi,
>   Would distributed search (the old way where you provide the solr host
> IP's etc.) still work between different solrclouds?
> 
> thanks,
> Darren
> 




SolrCloud deduplication

2012-05-18 Thread Markus Jelsma
Hi,

Deduplication on SolrCloud through the SignatureUpdateRequestProcessor is not 
functional anymore. The problem is that documents are passed multiple times 
through the URP and the digest field is added as if it were a multi-valued 
field. 
If the field is not multi-valued you'll get the typical error. Changing the 
order of URPs in the chain does not solve the problem.

Any hints on how to resolve the issue? Is this a problem in the 
SignatureUpdateRequestProcessor and does it need to be updated to work with 
SolrCloud? 

Thanks,
Markus
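
For context, a typical dedup chain per the Solr Deduplication wiki page looks like the sketch below. The signatureField matches the "digest" field mentioned above, while the fields list and signature class are assumptions:

```xml
<!-- solrconfig.xml sketch of a SignatureUpdateRequestProcessor chain -->
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">digest</str>
    <bool name="overwriteDupes">false</bool>
    <str name="fields">url,content</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```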


problem in replication

2012-05-18 Thread shinkanze
Hi all,


To provide realtime data we are delta-indexing every 15 minutes and then
replicating it to the slave.

*The autowarm count is 0. Dismax queries are getting really slow (30 to
90 seconds), and if I stop the delta replication the dismax queries get
fast again. If I run queries on a standalone server they are fast.*

What may be the issue?

Need help on a tight timeline.

  



--
View this message in context: 
http://lucene.472066.n3.nabble.com/problem-in-replication-tp3984654.html
Sent from the Solr - User mailing list archive at Nabble.com.
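
One thing to check: with an autowarm count of 0, every replication opens a searcher with completely cold caches, so the first dismax queries after each 15-minute cycle pay the full cost. A hedged solrconfig.xml sketch of warming on the slave (the sizes, counts, and sample query are all assumptions to tune):

```xml
<!-- Warm the new searcher after each replication; all values here are
     assumptions and should be tuned to the actual query load -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="64"/>

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">a frequent dismax query here</str>
      <str name="defType">dismax</str>
    </lst>
  </arr>
</listener>
```

The trade-off is that warming delays when the new searcher becomes visible, so heavy warming can fight with a 15-minute replication interval.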


Re: Problem with query negation

2012-05-18 Thread Ahmet Arslan
> I am trying the following query and get only zero results (I
> am supposed to
> get 10 results according to my dataset)
> 
> *http://mymachine:8983/solr/select/?q=-(HOSTID:302)*
> 
> I also tried the below query and got zero results yet
> again.
> 
> *http://mymachine:8983/solr/select/?q=NOT(HOSTID:302)*
> 
> However, I get 10 results(expected) when I put the query
> this way,
> 
> *http://mymachine:8983/solr/select/?q=-(HOSTID:302)AND(*:*)*
> 
> Why is this strange thing happening? Is it a bug in solr or
> am I missing
> something?

Why don't you just use /solr/select/?q=-HOSTID:302


Re: Use DIH with more than one entity at the same time

2012-05-18 Thread Sergio Martín Cantero
What the wiki indicates actually works, although it's not what I wanted. 
I have tried it and it works fine.


I have also tried Jack's approach and it also works fine (and it is what 
I was looking for :-)


Still, I have one more question. You wrote: "This is a 1.4.1 
installation, back when there was no "threads" option in DIH." I'm 
using Solr 3.5. What would the use of threads change? How could I take 
advantage of it, instead of declaring various DIHs in solrconfig.xml?


Thanks a lot!


El 17/05/12 18:33, Dyer, James escribió:

The wiki here indicates that you can specify "entity" more than once on the 
request and it will run multiple entities at the same time, in the same handler:  
http://wiki.apache.org/solr/DataImportHandler#Commands

But I can't say for sure that this actually works!  Having been in the DIH code, I would 
think such a feature is buggy at best, if it works at all.  But if you try it let us know 
how it works for you.  Also, if anyone else out there is using multiple 
"entity" parameters to get entities running in parallel, I'd be interested in 
hearing about it.

But the approach taken in the link Jack cites below does work.  It's a pain to 
set it up though.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: Thursday, May 17, 2012 10:21 AM
To: solr-user@lucene.apache.org
Subject: Re: Use DIH with more than one entity at the same time

Okay, the answer is “Yes, sort of, but...”

“One annoyance is because of how DIH is designed, you need a separate handler 
set up in solrconfig.xml for each DIH you plan to run.  So you have to plan in 
advance how many DIH instances you want to run, which config files they'll use, 
etc.”

See:
http://lucene.472066.n3.nabble.com/Multiple-dataimport-processes-to-same-core-td3645525.html

-- Jack Krupansky

From: Sergio Martín Cantero
Sent: Thursday, May 17, 2012 11:07 AM
To: solr-user@lucene.apache.org
Cc: Jack Krupansky
Subject: Re: Use DIH with more than one entity at the same time

Thanks Jack, but that's not what I want.

I don't want multiple entities in one invocation, but two simultaneous 
invocations of the DIH with different entities.

Thanks.
Sergio Martín Cantero

Office (ES) +34 91 733 73 97

playence Spain SL

sergio.mar...@playence.com

Calle Vicente Gaceo 19

28029 Madrid - España




El 17/05/12 17:04, Jack Krupansky escribió:
Yes. From the doc:

"Multiple 'entity' parameters can be passed on to run multiple entities at once. If 
nothing is passed, all entities are executed."

See:
http://wiki.apache.org/solr/DataImportHandler

But that is one invocation of DIH, not two separate updates as you tried.

-- Jack Krupansky

-Original Message- From: Sergio Martín Cantero
Sent: Thursday, May 17, 2012 10:46 AM
To: solr-user@lucene.apache.org
Subject: Use DIH with more than one entity at the same time

I'm new to this list, so... Hello everybody.

I'm trying to run the DIH with more than one entity at the same time,
but only the first entity I call is being indexed. The other doesn't get
any response.
For example:
First call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=users
Before the indexing has finished, I call:
http://localhost:8080/solr/dataimport?command=full-import&clean=false&entity=products

The second call doesn't have any effect, and the products are not
indexed at all.

Isn't it possible to run more than one full import for different
entities at the same time?

Thanks a lot for your help
Sergio
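
The "separate handler per DIH instance" approach from the linked thread can be sketched in solrconfig.xml like this (the handler names and DIH config file names are assumptions):

```xml
<!-- One DataImportHandler instance per parallel import -->
<requestHandler name="/dataimport-users"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dih-users.xml</str>
  </lst>
</requestHandler>

<requestHandler name="/dataimport-products"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dih-products.xml</str>
  </lst>
</requestHandler>
```

With that in place, /solr/dataimport-users?command=full-import and /solr/dataimport-products?command=full-import can run concurrently, since each handler keeps its own import state.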


Re: Indexing & Searching MySQL table with Hindi and English data

2012-05-18 Thread KP Sanjailal
Thank you for your guidance.  Actually, I am using Apache Solr with jetty
5.1.  I will try to setup solr using tomcat incorporating the configuration
suggested below by you.

Thank you

Sanjailal K P
--

On Thu, May 17, 2012 at 7:24 PM, Ahmet Arslan  wrote:

> > A search with keyword in Hindi retrieve emptly result
> > set.  Also a
> > retrieved hindi record displays junk characters.
>
> Could it be URIEncoding setting of your servlet container?
> http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config
>


Search plain text

2012-05-18 Thread Tolga

Hi,

I have 96 documents added to index, and I would like to be able to 
search in them in plain text, without using complex search queries. How 
can I do that?


Regards,


solrj with incorrect schema

2012-05-18 Thread Jamel ESSOUSSI
Hi;

I have an incorrect schema --> a missing field.

When I add documents (UpdateResponse ur = solrServer.add(docs)), I am
not able to catch an exception in SolrJ, and the UpdateResponse does not
report the failure either.

I use solr-core 3.6, solr-solrj 3.6 and solr.war 4.0

Best Regards

– Jamel ESSOUSSI


--
View this message in context: 
http://lucene.472066.n3.nabble.com/solrj-with-incorrect-schema-tp3984615.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Adding config to SolrCloud without creating any shards/slices

2012-05-18 Thread Per Steffensen
First of all, sorry about the subject of this discussion. It should have 
been something like "Adding config to SolrCloud without starting a Solr 
server"


Mark Miller skrev:

k
On May 16, 2012, at 5:35 AM, Per Steffensen wrote:

  

Hi

We want to create a Solr config in ZK during installation of our product, but 
we don't want to create any shards in that phase. We will create shards from our 
application when it starts up and also automatically maintain the set of shards 
from our application (which uses SolrCloud). The only way we know to create a 
Solr config in ZK is to spin up a Solr server with the system properties zkHost, 
bootstrap_confdir and collection.configName. Is there another, more API-ish, 
way of creating a Solr config in ZK?

Regards, Per Steffensen



I've started some work on this, but I have not finished.

There is a main method in ZkController that has some initial code. Currently it 
just lets you upload a specifically named config set directory - I would also 
like to add the same multi core config set upload option we have on startup - 
where it reads solr.xml, finds all the config dirs and uploads them, and links 
each collection to a config set named after it.
  

Yeah ok, I just want the config created - no shards/slices/collections.

Technically, you could use any tool to set this up - there are a variety of 
options in the zk world - you just have to place the config files under the 
right node.
I would really want to do it through Solr. This is the correct way, I 
think. So that, when you change your "strategy" e.g. location or format 
of configs in ZK, I will automatically inherit that.

 There is one other tricky part though - the collection has to be set to the 
right config set name. This is specified on the collection node in ZooKeeper. 
When creating a new collection, you can specify this as a param. If none is set 
and there is only one config set, that one config set is used. However, some 
link must be made, and it is not done automatically with your initial 
collections in solr.xml unless there is only one config set.
  
I know about that, and will use Solr to create collections. I just want 
the config established in ZK before that, and not create the config 
"during the process of creating a collection".

So now I'm thinking perhaps we should default the config set name to the 
collection name. Then if you simply use the collection name initially when you 
upload the set, no new linking is needed. If you don't like that, you can 
explicitly override what config set to use. Convention would be to name your 
config sets after your collection name, but extra work would allow you to do 
whatever you want.
  
I want several collections to use the same config, so I would have to do 
that extra work.

You can find an example of the ZkContoller main method being used in 
solr/cloud-dev scripts. The one caveat is that we pass an extra param to 
solrhome and briefly run a ZkServer within the ZkController#main method since 
we don't have an external ensemble. Normally this would not make sense and you 
would want to leave that out. I need to clean this all up (the ZkController 
params) and document it on the wiki as soon as I make these couple tweaks 
though.
  

Ok

Thanks, Mark

- Mark Miller
lucidimagination.com
  


Regards, Per Steffensen