Re: Exception when optimizing index

2012-06-18 Thread Rok Rejc
Hi all,

During the last few days I have created a Solr instance in a Windows environment
(the same Solr as on the Linux machine: Solr 4.0 from 9th June 2012, the same
Solr configuration, Tomcat 6, Java 6u23). I have also upgraded Java on the
Linux machine (1.7.0_05-b05 from Oracle).

Import and optimize on the Windows machine worked without any issue, but on
the Linux machine the optimize fails with the same exception:

Caused by: java.io.IOException: Invalid vInt detected (too many bits)
at
org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)
...

After that I also changed the directory factory (on the Linux machine) to
SimpleFSDirectoryFactory, reindexed all the documents, and ran the optimize
again. It fails again with the same exception.

As a next step I could maybe do partial insertions (which would be a painful
process), but beyond that I'm out of ideas (and out of time for
experimenting).

Many thanks for further suggestions.

Rok



On Wed, Jun 13, 2012 at 1:31 PM, Robert Muir rcm...@gmail.com wrote:

 On Thu, Jun 7, 2012 at 5:50 AM, Rok Rejc rokrej...@gmail.com wrote:
- java.runtime.name: OpenJDK Runtime Environment
- java.runtime.version: 1.6.0_22-b22
 ...
 
  As far as I see from the JIRA issue I have the patch attached (as
 mentioned
  I have a trunk version from May 12). Any ideas?
 

 it's not guaranteed that the patch will work around all HotSpot bugs
 related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5091921

 Since you can reproduce, is it possible for you to re-test the
 scenario with a newer JVM (e.g. 1.7.0_04) just to rule that out?

 --
 lucidimagination.com



delete by query don't work

2012-06-18 Thread ramzesua
Hi all. I am using Solr 4.0 and trying to clear the index by query. At first I
used <delete><query>*:*</query></delete> with a commit, but the index is still
not empty. I tried other queries, but they did not help. Then I tried deleting
by `id`. That works fine, but I need to clear the whole index. Can anyone help me?


--
View this message in context: 
http://lucene.472066.n3.nabble.com/delete-by-query-don-t-work-tp3990077.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Exception when optimizing index

2012-06-18 Thread Erick Erickson
Is it possible that you somehow have some problem with jars and classpath?
I'm wondering because this problem really seems odd, and you've eliminated
a bunch of possibilities. I'm wondering if you've somehow gotten some old
jars mixed in the bunch.

Or, alternately, what about re-installing Solr, on the theory that somehow you
got a bad download, files (i.e. the Solr jar files) got corrupted, your disk
has a bad spot, or...

Really clutching at straws here

Erick

On Mon, Jun 18, 2012 at 3:44 AM, Rok Rejc rokrej...@gmail.com wrote:
 Hi all,

 During the last few days I have created a Solr instance in a Windows environment
 (the same Solr as on the Linux machine: Solr 4.0 from 9th June 2012, the same
 Solr configuration, Tomcat 6, Java 6u23). I have also upgraded Java on the
 Linux machine (1.7.0_05-b05 from Oracle).

 Import and optimize on the Windows machine worked without any issue, but on
 the Linux machine the optimize fails with the same exception:

 Caused by: java.io.IOException: Invalid vInt detected (too many bits)
    at
 org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)
 ...

 After that I also changed the directory factory (on the Linux machine) to
 SimpleFSDirectoryFactory, reindexed all the documents, and ran the optimize
 again. It fails again with the same exception.

 As a next step I could maybe do partial insertions (which would be a painful
 process), but beyond that I'm out of ideas (and out of time for
 experimenting).

 Many thanks for further suggestions.

 Rok



 On Wed, Jun 13, 2012 at 1:31 PM, Robert Muir rcm...@gmail.com wrote:

 On Thu, Jun 7, 2012 at 5:50 AM, Rok Rejc rokrej...@gmail.com wrote:
    - java.runtime.name: OpenJDK Runtime Environment
    - java.runtime.version: 1.6.0_22-b22
 ...
 
  As far as I see from the JIRA issue I have the patch attached (as
 mentioned
  I have a trunk version from May 12). Any ideas?
 

 it's not guaranteed that the patch will work around all HotSpot bugs
 related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5091921

 Since you can reproduce, is it possible for you to re-test the
 scenario with a newer JVM (e.g. 1.7.0_04) just to rule that out?

 --
 lucidimagination.com



Re: delete by query don't work

2012-06-18 Thread Erick Erickson
Well, it would help if you defined what behavior you're seeing. When you
say delete-by-query doesn't work, what is the symptom? What does "empty"
mean? Because if you're just looking at your index directory and expecting
to see files disappear, you'll be disappointed.

When you delete documents in Solr, the docs are just marked as deleted, they
aren't physically removed until segments are merged. Does a query for *:* return
any documents after you delete-by-query?

Running an optimize after you do the delete will force merging to happen BTW.
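
For illustration, a minimal SolrJ sketch of that delete/commit/optimize/verify
cycle (a 4.x build and the example server URL assumed; adjust to your setup):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DeleteAllSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    server.deleteByQuery("*:*"); // marks all docs as deleted
    server.commit();             // makes the deletes visible to searchers
    server.optimize();           // merges segments, physically removing deleted docs
    QueryResponse rsp = server.query(new SolrQuery("*:*"));
    System.out.println("numFound = " + rsp.getResults().getNumFound()); // expect 0
  }
}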

If this doesn't help, please post the exact URLs you use, and your evidence
that the index isn't empty.

Best
Erick

On Mon, Jun 18, 2012 at 5:45 AM, ramzesua michaelnaza...@gmail.com wrote:
 Hi all. I am using Solr 4.0 and trying to clear the index by query. At first I
 used <delete><query>*:*</query></delete> with a commit, but the index is still
 not empty. I tried other queries, but they did not help. Then I tried deleting
 by `id`. That works fine, but I need to clear the whole index. Can anyone help me?


 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/delete-by-query-don-t-work-tp3990077.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Exception when optimizing index

2012-06-18 Thread Michael McCandless
Is it possible the Linux machine has bad RAM / bad disk?

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jun 18, 2012 at 7:06 AM, Erick Erickson erickerick...@gmail.com wrote:
 Is it possible that you somehow have some problem with jars and classpath?
 I'm wondering because this problem really seems odd, and you've eliminated
 a bunch of possibilities. I'm wondering if you've somehow gotten some old
 jars mixed in the bunch.

 Or, alternately, what about re-installing Solr, on the theory that somehow you
 got a bad download, files (i.e. the Solr jar files) got corrupted, your disk
 has a bad spot, or...

 Really clutching at straws here

 Erick

 On Mon, Jun 18, 2012 at 3:44 AM, Rok Rejc rokrej...@gmail.com wrote:
 Hi all,

 During the last few days I have created a Solr instance in a Windows environment
 (the same Solr as on the Linux machine: Solr 4.0 from 9th June 2012, the same
 Solr configuration, Tomcat 6, Java 6u23). I have also upgraded Java on the
 Linux machine (1.7.0_05-b05 from Oracle).

 Import and optimize on the Windows machine worked without any issue, but on
 the Linux machine the optimize fails with the same exception:

 Caused by: java.io.IOException: Invalid vInt detected (too many bits)
    at
 org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)
 ...

 After that I also changed the directory factory (on the Linux machine) to
 SimpleFSDirectoryFactory, reindexed all the documents, and ran the optimize
 again. It fails again with the same exception.

 As a next step I could maybe do partial insertions (which would be a painful
 process), but beyond that I'm out of ideas (and out of time for
 experimenting).

 Many thanks for further suggestions.

 Rok



 On Wed, Jun 13, 2012 at 1:31 PM, Robert Muir rcm...@gmail.com wrote:

 On Thu, Jun 7, 2012 at 5:50 AM, Rok Rejc rokrej...@gmail.com wrote:
    - java.runtime.name: OpenJDK Runtime Environment
    - java.runtime.version: 1.6.0_22-b22
 ...
 
  As far as I see from the JIRA issue I have the patch attached (as
 mentioned
  I have a trunk version from May 12). Any ideas?
 

 it's not guaranteed that the patch will work around all HotSpot bugs
 related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5091921

 Since you can reproduce, is it possible for you to re-test the
 scenario with a newer JVM (e.g. 1.7.0_04) just to rule that out?

 --
 lucidimagination.com



SolrCloud non-distributed indexing (update.chain)

2012-06-18 Thread Boon Low
Hi,

What's happening to the update.chain of SolrCloud?

I am running SolrCloud (compiled from trunk today) with an update.chain 
pointing to an updateRequestProcessorChain in solrconfig which omits the 
DistributedUpdateProcessorFactory, so that indexing can be done on specific 
shards (not distributed).

This worked previously but not in recent builds (e.g. since 6th June). I
noticed additional update parameters such as update.distrib being logged
across the cloud nodes:

...update.distrib=TOLEADER update.chain=notdistributed

I tried update.distrib=NONE; the indexing is still being distributed and the
update.chain (as specified below in the Solr config) is ignored.

<updateRequestProcessorChain name="notdistributed">
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

How do I get the above chain and non-distributed indexing to work again?

Regards,

Boon

-
Boon Low
Search UX and Engine Developer (SOLR)
brightsolid Online Publishing


__
brightsolid is used in this email to collectively mean brightsolid online 
innovation limited and its subsidiary companies brightsolid online publishing 
limited and brightsolid online technology limited.
findmypast.co.uk is a brand of brightsolid online publishing limited.
brightsolid online innovation limited, Gateway House, Luna Place, Dundee 
Technology Park, Dundee DD2 1TP.  Registered in Scotland No. SC274983.
brightsolid online publishing limited, The Glebe, 6 Chapel Place, Rivington 
Street, London EC2A 3DQ. Registered in England No. 04369607.
brightsolid online technology limited, Gateway House, Luna Place, Dundee 
Technology Park, Dundee DD2 1TP.  Registered in Scotland No. SC161678.

Email Disclaimer

This message is confidential and may contain privileged information. You should 
not disclose its contents to any other person. If you are not the intended 
recipient, please notify the sender named above immediately. It is expressly 
declared that this e-mail does not constitute nor form part of a contract or 
unilateral obligation. Opinions, conclusions and other information in this 
message that do not relate to the official business of brightsolid shall be 
understood as neither given nor endorsed by it.
__
This email has been scanned by the brightsolid Email Security System. Powered 
by MessageLabs
__


Re: StreamingUpdateSolrServer Connection Timeout Setting

2012-06-18 Thread Torsten Krah
On Friday, 15.06.2012, at 18:22 +0100, Kissue Kissue wrote:
 Hi,
 
 Does anybody know what the default connection timeout setting is for
 StreamingUpdateSolrServer? Can i explicitly set one and how?
 
 Thanks. 

Use a custom HttpClient to set one (these are only snippets; they should be
clear, but ask if not):

this.instance = new StreamingUpdateSolrServer(getUrl(), httpClient,
DOC_QUEUE_SIZE, WORKER_SIZE);

and use httpClient like this:

this.connectionManager = new MultiThreadedHttpConnectionManager();
final HttpClient httpClient = new HttpClient(this.connectionManager);
httpClient.getParams().setConnectionManagerTimeout(CONN_ACQUIRE_TIMEOUT);
httpClient.getParams().setSoTimeout(SO_TIMEOUT);
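
Assembled into a self-contained sketch (commons-httpclient 3.x and the
matching SolrJ constructor assumed; the upper-case constants are placeholders
for your own values):

import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;

public class TimeoutSketch {
  private static final long CONN_ACQUIRE_TIMEOUT = 5000; // ms to wait for a pooled connection
  private static final int SO_TIMEOUT = 30000;           // ms socket read timeout (0 = infinite)
  private static final int DOC_QUEUE_SIZE = 100;
  private static final int WORKER_SIZE = 4;

  public static StreamingUpdateSolrServer create(String url) throws Exception {
    MultiThreadedHttpConnectionManager cm = new MultiThreadedHttpConnectionManager();
    // the TCP connect timeout lives on the connection manager's params:
    cm.getParams().setConnectionTimeout(10000);
    HttpClient httpClient = new HttpClient(cm);
    httpClient.getParams().setConnectionManagerTimeout(CONN_ACQUIRE_TIMEOUT);
    httpClient.getParams().setSoTimeout(SO_TIMEOUT);
    return new StreamingUpdateSolrServer(url, httpClient, DOC_QUEUE_SIZE, WORKER_SIZE);
  }
}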

regards

Torsten




Re: StreamingUpdateSolrServer Connection Timeout Setting

2012-06-18 Thread Torsten Krah
Addendum: you can also register a custom protocol socket factory with
commons-httpclient (which is used by StreamingUpdateSolrServer) to influence
socket options. For example:

final Protocol http = new Protocol("http",
    MycustomHttpSocketFactory.getSocketFactory(), 80);

where MycustomHttpSocketFactory.getSocketFactory() returns a factory that
extends

org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory

and overrides / implements methods as needed (direct socket access).
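
A hedged sketch of such a factory (commons-httpclient 3.x assumed; the class
name and the tuned option are illustrative):

import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory;

public class MycustomHttpSocketFactory extends DefaultProtocolSocketFactory {
  private static final MycustomHttpSocketFactory INSTANCE =
      new MycustomHttpSocketFactory();

  public static MycustomHttpSocketFactory getSocketFactory() {
    return INSTANCE;
  }

  @Override
  public Socket createSocket(String host, int port, InetAddress localAddress,
      int localPort) throws IOException {
    Socket socket = super.createSocket(host, port, localAddress, localPort);
    socket.setTcpNoDelay(true); // direct socket access: tune options here
    return socket;
  }
}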

Call this e.g. in a ServletContextListener's contextInitialized() and you are
done.

regards

Torsten





Re: SolrCloud and split-brain

2012-06-18 Thread Sami Siren
On Sat, Jun 16, 2012 at 5:33 AM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:

 And here is one more Q.
 * Imagine a client is adding documents and, for simplicity, imagine SolrCloud 
 routes all these documents to the same shard, call it S.
 * Imagine that both the 7-node and the 3-node partition end up with a 
 complete index and thus both accept updates.

According to comments from Mark, this (having two functioning sides) is not
possible... only one side _can_ continue functioning (taking in updates).
Depending on how the shards are deployed over the nodes, a side may still not
accept updates (even if that side has a working ZK setup).

 Now imagine if the client sending documents for indexing happened to be 
 sending documents to 2 nodes, say in round-robin fashion.

In my understanding all updates are routed through a shard leader.

--
 Sami Siren


Re: delete by query don't work

2012-06-18 Thread Toke Eskildsen
On Mon, 2012-06-18 at 11:45 +0200, ramzesua wrote:
 Hi all. I am using Solr 4.0 and trying to clear the index by query. At first I
 used <delete><query>*:*</query></delete> with a commit, but the index is still
 not empty. I tried other queries, but they did not help. Then I tried deleting
 by `id`. That works fine, but I need to clear the whole index. Can anyone help me?

It's a subtle bug/problem in the default schema. Fortunately it is
easily fixable. See https://issues.apache.org/jira/browse/SOLR-3432
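
If I recall the issue correctly, delete-by-query is silently ignored when the
update log is enabled but the schema has no _version_ field; the workaround is
to declare one in schema.xml, roughly:

<!-- assumed workaround per SOLR-3432: needed when <updateLog/> is enabled -->
<field name="_version_" type="long" indexed="true" stored="true"/>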



Re: Lock error when indexing with curl

2012-06-18 Thread Heike Grimm
harun sahiner harunsahiner at gmail.com writes:

 
 Hi, 
 
 I have a similar lock error. Did you find any solution?
 
 --
 View this message in context:
http://lucene.472066.n3.nabble.com/Lock-error-when-indexing-with-curl-tp480958p3403119.html
 Sent from the Solr - User mailing list archive at Nabble.com.
 
 


Hi there!

I had the same problem. It seems that curl, run as root (I tried under
Debian), was not accepted as a user that may index files in that folder.
Changing the rights on the folder, or adding root as the user in curl,
will help :)

Greetings
Heike



RE: How to update one field without losing the others?

2012-06-18 Thread Kai Gülzau
I'm currently playing around with a branch_4x version
(https://builds.apache.org/job/Solr-4.x/5/) but I don't get field updates to
work.

A simple GET test request
http://localhost:8983/solr/master/update/json?stream.body={"add":{"doc":{"ukey":"08154711","type":1,"nbody":{"set":"mycontent"}}}}

results in
{
  "ukey":"08154711",
  "type":1,
  "nbody":"{set=mycontent}"
}

All fields are stored.
ukey is the unique key :-)
type is a required field.
nbody is a solr.TextField.


Is there any (wiki/readme) pointer on how to test and use this feature correctly?
What are the restrictions?

Regards,

Kai Gülzau

 
-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Saturday, June 16, 2012 4:47 PM
To: solr-user@lucene.apache.org
Subject: Re: How to update one field without losing the others?

Atomic update is a very new feature coming in 4.0 (i.e. grab a recent
nightly build to try it out).

It's not documented yet, but here's the JIRA issue:
https://issues.apache.org/jira/browse/SOLR-139?focusedCommentId=13269007page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269007

-Yonik
http://lucidimagination.com


Re: StreamingUpdateSolrServer Connection Timeout Setting

2012-06-18 Thread Torsten Krah
You should also call the glue code ;-):

Protocol.registerProtocol("http", http);
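
A hedged sketch of doing that registration once at webapp startup (class name
illustrative; MycustomHttpSocketFactory is the factory from the previous mail):

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import org.apache.commons.httpclient.protocol.Protocol;

public class HttpFactoryListener implements ServletContextListener {
  public void contextInitialized(ServletContextEvent sce) {
    // register the custom factory for all plain-http connections
    Protocol http = new Protocol("http",
        MycustomHttpSocketFactory.getSocketFactory(), 80);
    Protocol.registerProtocol("http", http);
  }

  public void contextDestroyed(ServletContextEvent sce) {
    // nothing to unregister in commons-httpclient 3.x
  }
}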

regards

Torsten




Re: SolrCloud and split-brain

2012-06-18 Thread Mark Miller

On Jun 15, 2012, at 10:33 PM, Otis Gospodnetic wrote:

 However, if my half brain understands what split brain is then I think that's 
 not a completely true claim because one can get unlucky and get a SolrCloud 
 cluster partitioned in a way that one or even all partitions reject indexing 
 (and update and deletion) requests if they do not have a complete index.

That's not split brain. Split brain means that multiple partitioned clusters 
think they are *the* cluster and would keep accepting updates. This is a real 
problem because when you unsplit the cluster, you cannot reconcile conflicting 
updates easily! In many cases you have to ask the user to resolve the conflict.

Yes, you must have a node to serve a shard in order to index to that shard. You 
do not need the whole index - but if an update hashes to a shard that has no 
nodes hosting it, it will fail. If there is no node, the document has nowhere 
to live. Some systems do interesting things like buffer those updates to other 
nodes for a while - we don't plan on anything like that soon. At some point, 
you can only survive the loss of so many nodes before it's time to give up 
accepting updates in any system. If you need to survive catastrophic loss of 
nodes, you have to have enough replicas to handle it. Whether those nodes are 
partitioned off from the cluster or simply die, it's all the same. You can only 
survive so many node losses, and replicas are your defense.

The lack of split-brain allows your cluster to remain consistent. If you allow 
split brain you have to use something like vector clocks and handle conflict 
resolution when the splits rejoin, or you will just have a lot of messed up 
data. You generally allow split brain when you want to favor write availability 
in the face of partitions, like Dynamo. But you must have a strategy for 
rejoining splits (like vector clocks or something) or you can never properly go 
back to a single, consistent cluster. We favor consistency in the face of 
partitions rather than write availability. It seemed like the right choice for 
Solr.

- Mark Miller
lucidimagination.com


Re: SolrCloud non-distributed indexing (update.chain)

2012-06-18 Thread Mark Miller
I think this was changed by https://issues.apache.org/jira/browse/SOLR-2822

Add NoOpDistributingUpdateProcessorFactory to your chain to avoid the distrib
update 'action' being auto-injected.
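
A sketch of the chain with that factory added (class name per SOLR-2822; its
position before RunUpdateProcessorFactory is my assumption):

<updateRequestProcessorChain name="notdistributed">
  <processor class="solr.LogUpdateProcessorFactory" />
  <!-- opts out of the auto-injected DistributedUpdateProcessorFactory -->
  <processor class="solr.NoOpDistributingUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>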

- Mark Miller
lucidimagination.com

On Jun 18, 2012, at 8:10 AM, Boon Low wrote:

 Hi,
 
 What's happening to the update.chain of SolrCloud?
 
 I am running SolrCloud (compiled from trunk today) with an update.chain 
 pointing to an updateRequestProcessorChain in solrconfig which omits the 
 DistributedUpdateProcessorFactory, so that indexing can be done on specific 
 shards (not distributed).
 
 This worked previously but not in recent builds (e.g. since 6th June). I 
 noticed additional update parameters such as update.distrib being 
 logged across the cloud nodes:
 
 ...update.distrib=TOLEADER update.chain=notdistributed
 
 I tried update.distrib=NONE; the indexing is still being distributed and 
 the update.chain (as specified below in the Solr config) is ignored.
 
   <updateRequestProcessorChain name="notdistributed">
     <processor class="solr.LogUpdateProcessorFactory" />
     <processor class="solr.RunUpdateProcessorFactory" />
   </updateRequestProcessorChain>
 
 How do I get the above chain and non-distributed indexing to work again?
 
 Regards,
 
 Boon
 
 -
 Boon Low
 Search UX and Engine Developer (SOLR)
 brightsolid Online Publishing
 
 
 __
 brightsolid is used in this email to collectively mean brightsolid online 
 innovation limited and its subsidiary companies brightsolid online publishing 
 limited and brightsolid online technology limited.
 findmypast.co.uk is a brand of brightsolid online publishing limited.
 brightsolid online innovation limited, Gateway House, Luna Place, Dundee 
 Technology Park, Dundee DD2 1TP.  Registered in Scotland No. SC274983.
 brightsolid online publishing limited, The Glebe, 6 Chapel Place, Rivington 
 Street, London EC2A 3DQ. Registered in England No. 04369607.
 brightsolid online technology limited, Gateway House, Luna Place, Dundee 
 Technology Park, Dundee DD2 1TP.  Registered in Scotland No. SC161678.
 
 Email Disclaimer
 
 This message is confidential and may contain privileged information. You 
 should not disclose its contents to any other person. If you are not the 
 intended recipient, please notify the sender named above immediately. It is 
 expressly declared that this e-mail does not constitute nor form part of a 
 contract or unilateral obligation. Opinions, conclusions and other 
 information in this message that do not relate to the official business of 
 brightsolid shall be understood as neither given nor endorsed by it.
 __
 This email has been scanned by the brightsolid Email Security System. Powered 
 by MessageLabs
 __















Re: WordBreak and default dictionary crash Solr

2012-06-18 Thread Carrie Coy

On 06/15/2012 05:16 PM, Dyer, James wrote:

 I'm pretty sure you've found a bug here. Could you tell me whether you're 
 using a build from Trunk or Solr_4x? Also, do you know the svn revision or 
 the Jenkins build # (or timestamp) you're working from?

I continued to see the problem after updating to the version below (previously
I was running a version built on 06-09):


solr-spec: 4.0.0.2012.06.16.10.22.10
solr-impl: 4.0-2012-06-16_10-02-16 1350899 - hudson - 2012-06-16 10:22:10


 Could you try using DirectSolrSpellChecker instead of IndexBasedSpellChecker 
 for your default dictionary?


Switching to DirectSolrSpellChecker appears to fix the problem: a query 
with 2 misspellings, one from each dictionary, does not crash Solr and 
is correctly spell-checked.


Thanks!

Carrie Coy


Re: SolrCloud and split-brain

2012-06-18 Thread Otis Gospodnetic
Hi Mark,

Thanks.  All that is clear (I think Voldemort does a good job with hinted 
handoff, which I think Mark is referring to).
The part that I'm not clear about is maybe not SolrCloud-specific, and that is 
- what exactly prevents the two halves of a cluster that's been split from 
thinking they are *the* cluster?
Let's say you have a 10-node cluster, say with 10 ZK instances, one instance on 
each Solr node.
And say 5 of these 10 servers are on switch A and the other 5 are on switch B.
Something happens and switch A and 5 nodes on it get separated from 5 nodes on 
switch B.
Say that both A and B happen to have complete copies of the index.

What in Solr (or ZK) tells either the A or the B half "no, you are not *the*
cluster and thou shalt not accept updates"?

I'm guessing 
this: https://cwiki.apache.org/confluence/display/ZOOKEEPER/FailureScenarios ?

So then the Q becomes: if we have 10 ZK nodes and they split in 5 & 5 nodes, 
does that mean neither side will have quorum because having 10 ZKs was a bad 
number of ZKs to have to begin with?

Thanks,
Otis 

Performance Monitoring for Solr / ElasticSearch / HBase - 
http://sematext.com/spm 




- Original Message -
 From: Mark Miller markrmil...@gmail.com
 To: solr-user solr-user@lucene.apache.org
 Cc: 
 Sent: Monday, June 18, 2012 11:05 AM
 Subject: Re: SolrCloud and split-brain
 
 
 On Jun 15, 2012, at 10:33 PM, Otis Gospodnetic wrote:
 
  However, if my half brain understands what split brain is then I think 
 that's not a completely true claim because one can get unlucky and get a 
 SolrCloud cluster partitioned in a way that one or even all partitions reject 
 indexing (and update and deletion) requests if they do not have a complete 
 index.
 
 That's not split brain. Split brain means that multiple partitioned clusters 
 think they are *the* cluster and would keep accepting updates. This is a real 
 problem because when you unsplit the cluster, you cannot reconcile conflicting 
 updates easily! In many cases you have to ask the user to resolve the conflict.
 
 Yes, you must have a node to serve a shard in order to index to that shard. You 
 do not need the whole index - but if an update hashes to a shard that has no 
 nodes hosting it, it will fail. If there is no node, the document has nowhere 
 to live. Some systems do interesting things like buffer those updates to other 
 nodes for a while - we don't plan on anything like that soon. At some point, 
 you can only survive the loss of so many nodes before it's time to give up 
 accepting updates in any system. If you need to survive catastrophic loss of 
 nodes, you have to have enough replicas to handle it. Whether those nodes are 
 partitioned off from the cluster or simply die, it's all the same. You can 
 only survive so many node losses, and replicas are your defense.
 
 The lack of split-brain allows your cluster to remain consistent. If you allow 
 split brain you have to use something like vector clocks and handle conflict 
 resolution when the splits rejoin, or you will just have a lot of messed up 
 data. You generally allow split brain when you want to favor write availability 
 in the face of partitions, like Dynamo. But you must have a strategy for 
 rejoining splits (like vector clocks or something) or you can never properly go 
 back to a single, consistent cluster. We favor consistency in the face of 
 partitions rather than write availability. It seemed like the right choice for 
 Solr.
 
 - Mark Miller
 lucidimagination.com



Re: SolrCloud and split-brain

2012-06-18 Thread Mark Miller

 So then the Q becomes: if we have 10 ZK nodes and they split in 5 & 5
 nodes, does that mean neither side will have quorum because having 10 ZKs
 was a bad number of ZKs to have to begin with?


Right - from the ZooKeeper admin guide, under Clustered Setup:

"Because Zookeeper requires a majority, it is best to use an odd number of
machines."

With 10 nodes a quorum needs 6, so a 5 & 5 split leaves neither side able to
proceed; with 9 nodes (quorum of 5), at most one side of any split can hold a
majority.



-- 
- Mark

http://www.lucidimagination.com


StandardTokenizerFactory behaviour

2012-06-18 Thread Alok Bhandari

Hello,

I have been working with Solr for the last few months and am stuck somewhere.

Analyzer in the field definition:

<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
</analyzer>

In: "Please, email john@foo.com by 03-09, re: m37-xq."

Expected out: "Please", "email", "john@foo.com", "by", "03-09", "re", "m37-xq"

but I am not getting this. Is something wrong with my understanding of
StandardTokenizer? I am using Solr 3.6.
Please let me know what is wrong with this. Thanks


--
View this message in context: 
http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-behaviour-tp3990215.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: StandardTokenizerFactory behaviour

2012-06-18 Thread Alok Bhandari

Just to make sure that there is no ambiguity: "Please, email john@foo.com by
03-09, re: m37-xq." is the input given to this field for indexing, and
"Please", "email", "john@foo.com", "by", "03-09", "re", "m37-xq" are the
expected output tokens.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-behaviour-tp3990215p3990216.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to update one field without losing the others?

2012-06-18 Thread Sami Siren
On Mon, Jun 18, 2012 at 5:03 PM, Kai Gülzau kguel...@novomind.com wrote:
 I'm currently playing around with a branch_4x version 
 (https://builds.apache.org/job/Solr-4.x/5/) but I don't get field updates to 
 work.

 A simple GET test request
 http://localhost:8983/solr/master/update/json?stream.body={"add":{"doc":{"ukey":"08154711","type":1,"nbody":{"set":"mycontent"}}}}

 results in
 {
   "ukey":"08154711",
   "type":1,
   "nbody":"{set=mycontent}"
 }

 All fields are stored.
 ukey is the unique key :-)
 type is a required field.
 nbody is a solr.TextField.

With the Solr example (4.x), the following seems to work:

URL=http://localhost:8983/solr/update
curl $URL?commit=true -H 'Content-type:application/json' -d '{"add":
{"doc": {"id": "id", "title": "test", "price_f": 10}}}'
curl $URL?commit=true -H 'Content-type:application/json' -d '{"add":
{"doc": {"id": "id", "price_f": {"set": 5}}}}'

If you are using solrj then there's a junit test method,
testUpdateField(), that does something similar:

http://svn.apache.org/viewvc/lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/SolrExampleTests.java?view=markup
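
For SolrJ users, a minimal sketch of the same atomic update (a 4.x nightly
assumed; field names follow the curl example above):

import java.util.HashMap;
import java.util.Map;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AtomicUpdateSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "id");               // unique key of the existing doc
    Map<String, Object> setOp = new HashMap<String, Object>();
    setOp.put("set", 5);                    // "set" replaces the stored value
    doc.addField("price_f", setOp);         // other stored fields are preserved
    server.add(doc);
    server.commit();
  }
}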

--
 Sami Siren