Where does the Solr load balancer run?

2014-01-10 Thread Sreehareesh Kaipravan Meethaleveetil
Hi,

I'm a little confused about Solr replication/load balancing. Where exactly 
does the load balancer run: on the master node (I guess not), on a slave, 
or somewhere else? Please let me know.

Thanks & Regards,
Sreehareesh KM


How to index data in multiValued field with key

2014-01-10 Thread rachun
This might be a very simple question, but I couldn't figure it out after
googling all day.

I just want the data to come back like this:

"record": [
  {
    "id": "product001",
    "name": "iPhone case",
    "title": {
      "th": "เคส ไอโฟน5 iphone5 Case วิบวับ ลายผสมมุกสีชมพู back case",
      "en": "iphone5 Case pinky pearl back case"
    }
  }
]

and this is my schema.xml:

<field name="title" type="text_th" indexed="true" stored="true"
       multiValued="true"/>

and this is my PHP code:

<?php

require_once( 'SolrPhpClient/Apache/Solr/Service.php' );
$solr = new Apache_Solr_Service( 'localhost', '8983', './solr' );

if ( !$solr->ping() ) {
  echo 'Solr service is not responding';
  exit;
}

$parts = array(
  'spark_plug' => array(
    'id' => 11,
    'name' => 'Spark plug',
    'title' => array(
      'th' => 'เคส sdsdไอโฟน4 iphone4 Case วิบวับ ลายหอไอเฟลสุดเก๋ สีชมพูเข้ม ปปback case',
      'en' => 'New design Iphone 4 case with pink and beautiful back case '
    ),
    'model' => array( 'a' => 'Boxster', 'b' => '924' ),
    'price' => 25.00,
    'inStock' => true,
  ),
  'windshield' => array(
    'id' => 2,
    'name' => 'Windshield',
    'model' => '911',
    'price' => 15.50,
    'inStock' => false,
    'url' => 'http://store.weloveshopping.com/joeishiablex12'
  )
);

$documents = array();

foreach ( $parts as $item => $fields ) {
  $doc = new Apache_Solr_Document();

  foreach ( $fields as $key => $value ) {
    if ( is_array( $value ) ) {
      // multi-valued: each value is added separately; the array keys
      // ('th', 'en') are discarded here
      foreach ( $value as $datum ) {
        $doc->setMultiValue( $key, $datum );
      }
    }
    else {
      $doc->$key = $value;
    }
  }

  $documents[] = $doc;
}

try {
  $solr->addDocuments( $documents );
  $solr->commit();
  $solr->optimize();
}
catch ( Exception $e ) {
  echo $e->getMessage();
}

?>

but the response I'm getting now is like below; as you can see, it has no keys
("th" or "en"):

"record": [
  {
    "id": "product001",
    "name": "iPhone case",
    "title": {
      "เคส ไอโฟน5 iphone5 Case วิบวับ ลายผสมมุกสีชมพู back case",
      "iphone5 Case pinky pearl back case"
    }
  }
]


Please help, a million thanks,
Chun.





Re: Search query on field that contains space

2014-01-10 Thread PeterKerk
@iorixxx: thanks, your 2nd solution worked.

The first one didn't (it doesn't matter now); I got this:

<field name="title" type="prefix_full" indexed="true" stored="true"/>
<field name="title_search" type="prefix_full" indexed="true" stored="true"/>

With the first solution all queries work as expected, however with this:

q=title_search:new%20yk*

"new york" is still returned.





Re: Where does the Solr load balancer run?

2014-01-10 Thread Mikhail Khludnev
Hi Sreehareesh,
In a master-slave setup there is no load balancer built in. You need to provide
one externally and configure it to rotate across the slave endpoints.
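
For example, if the clients are Java, SolrJ ships a simple client-side
round-robin balancer you could point at the slaves. A minimal sketch (host and
core names are placeholders):

  import java.net.MalformedURLException;
  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServerException;
  import org.apache.solr.client.solrj.impl.LBHttpSolrServer;

  public class SlaveRoundRobin {
    public static void main(String[] args)
        throws MalformedURLException, SolrServerException {
      // Rotates requests across the listed slave endpoints, skipping dead ones.
      LBHttpSolrServer lb = new LBHttpSolrServer(
          "http://slave1:8983/solr/collection1",
          "http://slave2:8983/solr/collection1");
      System.out.println(
          lb.query(new SolrQuery("*:*")).getResults().getNumFound());
    }
  }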


On Fri, Jan 10, 2014 at 12:20 PM, Sreehareesh Kaipravan Meethaleveetil 
smeethalevee...@sapient.com wrote:

 [quoted message snipped; see the original message above]




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Sorting of facets

2014-01-10 Thread Markus.Mirsberger

Hi,

Is it possible to sort the facet results by fields other than the facet 
field?


e.g. I have 3 int fields: directory, pages, links

Because I want all unique directories, I have to use directory as the 
facet.field parameter.
As far as I understand what I've read, I can then only sort the facet results 
by the number of occurrences of each directory.


But the top directories in my use case are those with the most links and 
the most pages on them.


Is there a way to sort my facet results by another field or is it maybe 
possible to have a group of facet fields which can be sorted by each 
field inside the group?



Thanks in advance,
Markus


Boosting documents at index time, based on payloads

2014-01-10 Thread michael.boom
Hi,

I'm not really sure how/if payloads work. (I tried out Rafal Kuc's payload
example from the Apache Solr 4 Cookbook and it did not do what I was expecting;
see below for what I was trying to do, and please correct me if I was looking
for the wrong droid.)

What I am trying to achieve is similar to the payload principle: give a
certain term a boost value at index time.
At query time, if that term is searched, the boost value should influence
the scoring, with docs carrying bigger boost values preferred over those with
smaller boost values.

Can this be achieved using payloads? I expect so, but then how should this
behaviour be implemented? The basic recipe failed to work, so I'm a little
confused.
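
For reference, the stock Solr 4 example schema defines an index-time payload
field type roughly like the sketch below; whether its payloads can drive score
boosting the way described above is exactly the open question here:

  <fieldtype name="payloads" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.DelimitedPayloadTokenFilterFactory"
              encoder="float" delimiter="|"/>
    </analyzer>
  </fieldtype>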

Thanks!



-
Thanks,
Michael


Re: How to index data in multiValued field with key

2014-01-10 Thread Stefan Matheis
It doesn't work like that: a multiValued field is like a list. PHP doesn't make
a difference between a list and a map, but Solr does, so you can't have keys in
those fields.

Based on the info you've provided, it looks more like you in fact need
different analyzers to handle the English vs. the Thai text properly. You
could try title_th and title_en fields and configure them according to your
needs.
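
A hedged sketch of what that could look like in schema.xml (assuming a text_en
field type is defined, as in the stock example schema):

  <field name="title_th" type="text_th" indexed="true" stored="true"/>
  <field name="title_en" type="text_en" indexed="true" stored="true"/>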

-Stefan  


On Friday, January 10, 2014 at 10:06 AM, rachun wrote:

 [original message quoted in full; snipped, see above]




Re: Analysis page broken on trunk?

2014-01-10 Thread Stefan Matheis
Sorry for not getting back on this earlier. I've tried several fields with 
values from the example docs and that looks pretty okay to me; I noticed no 
change there.

Can you share a screenshot or something like that? And perhaps the input and 
the field/field type which don't work for you?

-Stefan 


On Wednesday, January 8, 2014 at 2:24 PM, Markus Jelsma wrote:

 Hi - You will see on the left side each filter abbreviation but you won't see 
 anything in the right container. No terms, positions, offsets, nothing.
 
 Markus
 
 
 -Original message-
  From:Stefan Matheis matheis.ste...@gmail.com 
  (mailto:matheis.ste...@gmail.com)
  Sent: Wednesday 8th January 2014 14:10
  To: solr-user@lucene.apache.org (mailto:solr-user@lucene.apache.org)
  Subject: Re: Analysis page broken on trunk?
  
  Hey Markus
  
   I'm not up to date with the latest changes, but if you can describe how to 
   reproduce it, I can try to verify it.
  
  -Stefan 
  
  
  On Wednesday, January 8, 2014 at 12:44 PM, Markus Jelsma wrote:
  
   Hi - it seems the analysis page is broken on trunk and it looks like our 
   4.5 and 4.6 builds are unaffected. Can anyone on trunk confirm this? 
   Markus
   
  
  
 
 
 




Re: Solr 4.6.0: DocValues (distributed search)

2014-01-10 Thread Manuel Le Normand
In short, when running a distributed search every shard runs the query
separately. Each shard's collector returns the topN (rows param) internal
docIds of the matching documents.

These topN docIds are converted to their uniqueKeys in the
BinaryResponseWriter and sent to the frontend core (the one that received
the query). This conversion is implemented with a StoredFieldVisitor, meaning
the uniqueKeys are read from their stored field and not from their
docValues.

As our use case has a high rows param, these conversions became a
performance bottleneck. We implemented a user cache that stores the shard's
uniqueKey docValues as a [docId, uniqueKey] mapping. This eliminates
the need to access the stored field for these frequent conversions.
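
For illustration only (not the actual patch code), resolving a docId through
docValues in Lucene 4.x looks roughly like this, assuming the uniqueKey field
is named "id" and has docValues enabled:

  import java.io.IOException;
  import org.apache.lucene.index.AtomicReader;
  import org.apache.lucene.index.SortedDocValues;
  import org.apache.lucene.util.BytesRef;

  final class UniqueKeyLookup {
    // Reads the uniqueKey from docValues instead of the stored field;
    // SOLR-5478 adds a [docId, uniqueKey] cache on top of this idea.
    static String uniqueKey(AtomicReader reader, int docId) throws IOException {
      SortedDocValues dv = reader.getSortedDocValues("id");
      BytesRef term = new BytesRef();
      dv.get(docId, term);  // fills 'term' with the key bytes
      return term.utf8ToString();
    }
  }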

You can have a look at the patch; feel free to comment:
https://issues.apache.org/jira/browse/SOLR-5478

Best,
Manuel


On Thu, Jan 9, 2014 at 7:33 PM, ku3ia dem...@gmail.com wrote:

 Today I set up a simple SolrCloud with two shards. Seems the same. When I'm
 debugging a distributed search I can't catch a breakpoint in the Lucene codec
 file, but when I'm using faceted search everything looks fine: the debugger
 stops.

 Can anyone help me with my question? Thanks.






Re: Copying Index

2014-01-10 Thread anand chandak

 Folks, any response to the query below would be highly appreciated.

Thanks,

Anand


On 1/10/2014 11:04 AM, anand chandak wrote:

Hi,


I am testing the replication feature of Solr 4.x with a large index; 
unfortunately, the index that we had was in the 3.x format. To convert it 
to 4.x I copied the index (a file-system copy) and then ran the 
IndexUpgrader utility. The utility did what it is supposed to do and I had 
a 4.x index (verified it with CheckIndex). However, now when I replicate, 
the upgraded index is not getting replicated, and I don't see any errors 
in the log file either. Can somebody throw some light on what the issue 
could be here?


Thanks,

Anand






Re: solr text analysis showing a red bar error

2014-01-10 Thread Erick Erickson
Hmmm, works on a 4x Solr.

Please paste the raw text you're putting in the entry field here, so I
don't have to re-type it all from the image (I can't cut/paste it).

What version of Solr are you using?

Anything come out in the Solr log that looks suspicious?

Best,
Erick


On Thu, Jan 9, 2014 at 7:52 AM, Umapathy S nsupat...@gmail.com wrote:

 Hi,

  I am new to Solr/Lucene.
  I am trying to do text analysis on my index. The error below
  (screenshot) is shown when I increase the field value length. I have tried
  searching in vain for any length-specific restrictions in solr.TextField.
  There is no error text/exception thrown.

 [image: Inline images 1]

  The field is:

  <field name="text" type="text_general" stored="true" indexed="true" />

  The field type is:

  <fieldType name="text_general" class="solr.TextField"
             positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt" enablePositionIncrements="true" />
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
              words="stopwords.txt" enablePositionIncrements="true" />
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>


 Any help much appreciated.

 Thanks

 Umapathy



Re: Search query on field that contains space

2014-01-10 Thread Erick Erickson
What's the purpose of having two fields, title and title_search? They
are exactly the same, so it seems you could get rid of one.

Just a nit.
Erick

As far as the analysis page is concerned, I suspect you took out
this definition from your solrconfig.xml file:

 <requestHandler name="/analysis/field"
                 startup="lazy"
                 class="solr.FieldAnalysisRequestHandler" />

PUT IT BACK ;). Really, this page will save you again and again
and again.

At least when I commented out this definition and tried using the
analysis page I got the same error. You may have taken out other
things in your solrconfig.xml file that are needed for this to work, but
this is the place to start.

Best
Erick

On Fri, Jan 10, 2014 at 4:31 AM, PeterKerk vettepa...@hotmail.com wrote:
 [quoted message snipped; see PeterKerk's message above]


Re: Copying Index

2014-01-10 Thread Erick Erickson
You have to be a bit patient. Folks in the US, at least, are barely awake yet.
We are not a paid-support helpdesk...

On to your problem. You haven't given enough information to say much. For
instance, what do you mean by replication? Old-style master/slave replication?
Do you see any errors in your logs? Or any attempt on the part of the slave
(assuming M/S replication) to replicate? What is the state of your
slave that you
expect it to replicate? How are your slaves configured? Are they pointing to the
master properly?

If you're in SolrCloud mode, then it's a different story.

You might review:
http://wiki.apache.org/solr/UsingMailingLists

Best,
Erick

On Fri, Jan 10, 2014 at 6:59 AM, anand chandak anand.chan...@oracle.com wrote:
 [quoted message snipped; see the original message above]




Re: Copying Index

2014-01-10 Thread anand chandak

Erick


My apologies if I was rushing. Yes, it's an old-style master/slave 
replication. I went through the Solr logs but don't see any errors as 
such. The slaves are correctly configured and pointing to the master 
correctly. One thing I noted: on adding any new document to the master, 
it gets replicated to the slave correctly, but the old index that I copied 
and upgraded to the 4.x format is not getting replicated. I also ran the 
checkIndex -f utility and don't see any issue there either.


Thanks,

Anand


On 1/10/2014 6:30 PM, Erick Erickson wrote:

 [quoted messages snipped; see Erick's reply above]


leading wildcard characters

2014-01-10 Thread Peter Keegan
How do you disable leading wildcards in 4.x? The setAllowLeadingWildcard
method is there in the parser, but nothing references the getter. Also, the
edismax parser always enables it and provides no way to override it.

Thanks,
Peter


Re: Solr 4.6.0: DocValues (distributed search)

2014-01-10 Thread ku3ia
Manuel Le Normand wrote:
 [quoted message snipped; see Manuel's reply above]

Hi Manuel! Many thanks for your post! I'll try your patch.





Re: Copying Index

2014-01-10 Thread Erick Erickson
OK, probably this is what's happening: there's no event that causes
the slave to say "oh, my index is out of date." This assumes (and I
haven't checked) that you have the same number of segments etc.
after the upgrade to the 4.x format.

So when you update a doc, that registers a change event that the
slave recognizes as a changed index and pulls the doc down as part
of a new segment.

So I posit that eventually the entire index will be replicated in the new
format as the segments get merged.

You can probably force this by doing an optimize on the master.
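
A hedged example of triggering that optimize over HTTP (assuming a core named
collection1 on the master): http://master:8983/solr/collection1/update?optimize=true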

WARNING: This is theoretical. I'm not saying this from a deep understanding
of the replication code, but it seems like a good story.

Best,
Erick

On Fri, Jan 10, 2014 at 8:09 AM, anand chandak anand.chan...@oracle.com wrote:
 [quoted messages snipped; see the exchange above]

Re: solr text analysis showing a red bar error

2014-01-10 Thread Umapathy S
I think it's an HTTP GET parameter length/size issue. I reached the maximum
number of characters allowed through "Field Value (Index)". But when I
added characters to "Field Value (Query)", I got the red bar again, and I had
to reduce the characters in "Field Value (Index)" to make it work.
I was using Chrome, so I possibly hit its ~2KB GET limit.

No, the request never reached the Solr service (it was running on localhost).

Thanks.


On 10 January 2014 12:38, Erick Erickson erickerick...@gmail.com wrote:

 [quoted messages snipped; see the exchange above]



Re: solr text analysis showing a red bar error

2014-01-10 Thread Erick Erickson
Ah, OK. The analysis page in the admin screen is not really
intended to analyze large text blocks. I suspect that if you're
running into size limitations, you'll find the output pretty hard
to read anyway. I almost always use it with pretty short
text fragments, usually just a few words.

FWIW,
Erick

On Fri, Jan 10, 2014 at 9:50 AM, Umapathy S nsupat...@gmail.com wrote:
 [quoted messages snipped; see the exchange above]



Re: leading wildcard characters

2014-01-10 Thread Ahmet Arslan
Hi Peter,

Can you remove any occurrence of ReversedWildcardFilterFactory from schema.xml 
(even if you don't use it)?

Ahmet



On Friday, January 10, 2014 3:34 PM, Peter Keegan peterlkee...@gmail.com 
wrote:
 [quoted message snipped; see the original message above]

Thanks,
Peter



Re: DateField - Invalid JSON String Exception - converting Query Response to JSON Object

2014-01-10 Thread Chris Hostetter

: Response:
: {responseHeader={status=0,QTime=0,params={lowercaseOperators=true,sort=score
: 
desc,cache=false,qf=content,wt=javabin,rows=100,defType=edismax,version=2,fl=*,score,start=0,q=White+Paper,stopwords=true,fq=type:White
: 
Paper}},response={numFound=9,start=0,maxScore=0.61586785,docs=[SolrDocument{id=007,
: type=White Paper, source=Documents, title=White Paper 003, body=White Paper
: 004 Body, author=[Author 3], keywords=[Keyword 3], description=Vivamus
: turpis eros, mime_type=pdf, _version_=1456609602022932480,
: *publication_date=Wed
: Jan 08 03:16:06 IST 2014*, score=0.61586785}]},

You are not looking at JSON data -- you are looking at a simple toString 
value from the QueryResponse Java object. It's not intended to be 
used for anything beyond debugging.

If you want the raw JSON data from Solr, then you should either *not* use 
SolrJ (most of that code is for parsing the response into Java objects, and 
you apparently don't want that) ... or you should specify your own 
ResponseParser that will give you access to the raw stream of JSON...

  class YourRawResponseParser extends ResponseParser {
    // ...
    public NamedList<Object> processResponse(InputStream body, String encoding) {
      // ...
      // do some JSON processing of body
      // ...
    }
  }

But this assumes you want the raw JSON values returned by Solr -- 
previously you mentioned that you were trying to *create* JSON from the 
data returned by Solr using a JSON-generating library -- in which case you 
may in fact want to use Solr's binary response format, get the structured 
Java object response, and then walk it, pulling out just the pieces of data 
from the response you want, and pass those specific values to your JSON 
generation library.
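
A minimal SolrJ sketch of that approach (field names are taken from the debug
output above and may differ in your schema; the JSON-generation step is
omitted):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.SolrDocument;

  final class WalkResponse {
    static void dump(SolrServer server) throws Exception {
      QueryResponse rsp = server.query(new SolrQuery("White Paper"));
      for (SolrDocument doc : rsp.getResults()) {
        // pull out just the values you need, then hand them to your
        // JSON generation library
        System.out.println(doc.getFieldValue("id") + " -> "
            + doc.getFieldValue("publication_date"));  // a java.util.Date
      }
    }
  }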

It's hard to tell, because you haven't really elaborated on what it is you 
are trying to do -- all you've made clear is that you are getting 
invalid JSON from a method that has never ever been meant to return JSON...

https://people.apache.org/~hossman/#xyproblem
XY Problem

Your question appears to be an XY Problem ... that is: you are dealing
with X, you are assuming Y will help you, and you are asking about Y
without giving more details about the X so that we can understand the
full issue.  Perhaps the best solution doesn't involve Y at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341




-Hoss
http://www.lucidworks.com/


Re: leading wildcard characters

2014-01-10 Thread Peter Keegan
Removing ReversedWildcardFilterFactory had no effect.


On Fri, Jan 10, 2014 at 10:48 AM, Ahmet Arslan iori...@yahoo.com wrote:

 [quoted message snipped; see Ahmet's reply above]




Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Chris Hostetter

: The problem manifests as this sort of thing:
: 
: Jan 3, 2014 6:05:33 PM org.apache.solr.common.SolrException log
: SEVERE: java.lang.IllegalArgumentException: startOffset must be
: non-negative, and endOffset must be >= startOffset,
: startOffset=-1811581632,endOffset=-1811581632

Is there a stack trace in the log to go along with that?  there should be.

My suspicion is that since analysis errors like these are 
RuntimeExceptions, they may not be getting caught & re-thrown with as much 
context as they should -- so by the time they get logged (or returned to 
the client) there isn't any info about the problematic field value, let 
alone the uniqueKey.

If we had a test case that reproduces it (i.e. with a mock token filter that 
always throws a RuntimeException when a token matches "fail_now" or 
something), we could have some tests that assert indexing a doc with that 
token results in a useful error -- which should help ensure that useful 
error also gets logged (although I don't think we really have any 
easy way of asserting specific log messages at the moment).
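
A hedged sketch of such a mock (the names are made up; only the failing
behaviour matters):

  import java.io.IOException;
  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

  // Throws whenever the token "fail_now" passes through, so a test can
  // assert that the resulting indexing error carries useful context.
  final class FailOnTokenFilter extends TokenFilter {
    private final CharTermAttribute termAtt =
        addAttribute(CharTermAttribute.class);

    FailOnTokenFilter(TokenStream input) { super(input); }

    @Override
    public boolean incrementToken() throws IOException {
      if (!input.incrementToken()) return false;
      if ("fail_now".contentEquals(termAtt)) {
        throw new IllegalArgumentException("simulated analysis failure");
      }
      return true;
    }
  }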


-Hoss
http://www.lucidworks.com/


Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Benson Margulies
OK, patch forthcoming.

On Fri, Jan 10, 2014 at 11:23 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 [quoted message snipped; see Chris's reply above]


Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Benson Margulies
Is there a neighborhood of existing tests I should be visiting here?


On Fri, Jan 10, 2014 at 11:27 AM, Benson Margulies
bimargul...@gmail.com wrote:
 [quoted messages snipped; see the exchange above]


Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Chris Hostetter

: Is there a neighborhood of existing tests I should be visiting here?

You'll need a custom schema that refers to your new 
MockFailOnCertainTokensFilterFactory, so I would create a completely new 
test class somewhere in ...solr.update (you're testing that an update 
fails with a clean error).


-Hoss
http://www.lucidworks.com/


Indexing spatial fields into SolrCloud (HTTP)

2014-01-10 Thread Beale, Jim (US-KOP)
I am porting an application from Lucene to Solr which makes use of spatial4j 
for distance searches.  The Lucene version works correctly but I am having a 
problem getting the Solr version to work in the same way.

Lucene version:

SpatialContext geoSpatialCtx = SpatialContext.GEO;

geoSpatialStrategy = new RecursivePrefixTreeStrategy(
    new GeohashPrefixTree(geoSpatialCtx, GeohashPrefixTree.getMaxLevelsPossible()),
    DocumentFieldNames.LOCATION);


Point point = geoSpatialCtx.makePoint(lon, lat);
for (IndexableField field : geoSpatialStrategy.createIndexableFields(point)) {
  document.add(field);
}

// Store the field
document.add(new StoredField(geoSpatialStrategy.getFieldName(),
    geoSpatialCtx.toString(point)));

Solr version:

Point point = geoSpatialCtx.makePoint(lon, lat);
for (IndexableField field : geoSpatialStrategy.createIndexableFields(point)) {
  try {
    solrDocument.addField(field.name(), field.tokenStream(analyzer));
  } catch (IOException e) {
    LOGGER.error("Failed to add geo field to Solr index", e);
  }
}

// Store the field
solrDocument.addField(geoSpatialStrategy.getFieldName(),
    geoSpatialCtx.toString(point));

The server-side error is as follows:

Caused by: com.spatial4j.core.exception.InvalidShapeException: Unable to read: 
org.apache.lucene.spatial.prefix.PrefixTreeStrategy$CellTokenStream@0
at 
com.spatial4j.core.io.ShapeReadWriter.readShape(ShapeReadWriter.java:48)
at 
com.spatial4j.core.context.SpatialContext.readShape(SpatialContext.java:195)
at 
org.apache.solr.schema.AbstractSpatialFieldType.parseShape(AbstractSpatialFieldType.java:142)

I've seen David Smiley's sample code, specifically the class 
SpatialDemoUpdateProcessorFactory, but I can't say that I was able to benefit 
from it at all.

What I'm trying to do seems like it should be easy (just indexing a point for 
distance searching), but I'm obviously missing something.

Any ideas?
Thanks,
Jim



Re: Tracking down the input that hits an analysis chain bug

2014-01-10 Thread Benson Margulies
Thanks, that's the recipe that I need.

On Fri, Jan 10, 2014 at 11:40 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 [quoted message snipped; see Chris's reply above]


CSVResponseWriter and grouped results not working

2014-01-10 Thread Matt Kleweno
With Solr 4.3.1, it appears the CSVResponseWriter does not return any
results if group=true. Is that correct, or am I doing something wrong? I
get results when not grouping.

I wanted to verify before posting a feature request.


Re: CSVResponseWriter and grouped results not working

2014-01-10 Thread Lianyi Han
Same here with Solr 4.4.0: no results when group=true&wt=csv.


-lianyi
Less isn't more; just enough is more. -Milton Glaser


On Fri, Jan 10, 2014 at 2:37 PM, Matt Kleweno matt.klew...@gmail.com wrote:
 [quoted message snipped; see the original message above]


Re: Perl Client for SolrCloud

2014-01-10 Thread Tim Vaillancourt
I'm pretty interested in taking a stab at a Perl CPAN module for SolrCloud that 
is ZooKeeper-aware; it's the least I can do for Solr as a non-Java 
developer. :)


A quick question though: how would I write the shard logic to behave 
similarly to Java's ZooKeeper-aware client? I'm able to get the hash/hex 
range needed for each shard from clusterstate.json, but how do I know which 
field to hash on?


I'm guessing I also need to read the collection's schema.xml from 
ZooKeeper to get the uniqueKey, and then use that for sharding. Or does the 
Java client take the sharding field as input? Looking for ideas here.


Thanks!

Tim

On 08/01/14 09:35 AM, Chris Hostetter wrote:

:  I couldn't find anyone which can connect to SolrCloud similar to SolrJ's
:  CloudSolrServer.
:
: Since I have a load balancer in front of 8 nodes, WebService::Solr[1] still
: works fine.

Right -- just because SolrJ is ZooKeeper-aware doesn't mean you can *only*
talk to SolrCloud with SolrJ -- you can still use any HTTP client of your
choice to connect to your Solr nodes in a round-robin fashion (or via a
load balancer) if you wish -- just like with a non-SolrCloud deployment
using something like master/slave.

What you might want to consider, is taking a look at something like
Net::ZooKeeper to have a ZK aware perl client layer that could wrap
WebService::Solr.


-Hoss
http://www.lucidworks.com/


Re: Perl Client for SolrCloud

2014-01-10 Thread Chris Hostetter

: A quick question though: how would I write the shard logic to behave similar
: to Java's Zookeeper-aware client? I'm able to get the hash/hex needed for each
: shard from clusterstate.json, but how do I know which field to hash on?

The logic you're asking about is encapsulated in the DocRouter (which can 
be customized per collection).  I'm not sure how the CloudSolrServer SolrJ 
client currently deals with knowing which DocRouter to use, but for a 
non-Java language that can't directly load the same classes, a great first 
step would be...

 * be configurable solely with a list of ZK addresses
 * connect to ZK and per collection be continuously aware of:
   - the list of all live nodes as they go up/down
   - the list of leaders as shard elections happen
 * for queries, route to a random live node
 * for updates, route to any live leader

The most important parts are the first two bullets. The last bullet is an 
optimization over just sending to a random node, because it increases 
the odds of hitting the correct leader for the doc in question regardless 
of which DocRouter is in use.
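
For reference, a hedged sketch of the hashing half of that optimization: the
default hash-based router murmur3-hashes the route key (by default the
uniqueKey) and matches the result against each shard's hash range from
clusterstate.json. The Hash class below exists in Solr 4.x; the wrapper and
the range-matching step are illustrative, not Solr's API:

  import org.apache.solr.common.util.Hash;

  final class ShardHash {
    // Same murmur3 variant SolrCloud's hash-based routing uses.
    static int hashOf(String routeKey) {
      return Hash.murmurhash3_x86_32(routeKey, 0, routeKey.length(), 0);
    }
  }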


-Hoss
http://www.lucidworks.com/