Differences in debugQuery and results

2011-12-21 Thread roySolr
Hello,

I have some configuration problems and can't get it working. I see some
differences with the debugQuery.

I search for: w.j

DisjunctionMaxQuery(((name1_search:w name1_search:j)^5.0 |
(name2_search:w name2_search:j)^5.0)~1.0)

I search for: w j
DisjunctionMaxQuery((name1_search:w^5.0 | name2_search:w^5.0)~1.0)
DisjunctionMaxQuery((name1_search:j^5.0 | name2_search:j^5.0)~1.0)

I use the WordDelimiterFilter to split on a dot. Why is there a difference? I
want Solr to handle both queries the same way. How can I fix this?

CONFIG:
<fieldType name="text_delimiter" class="solr.TextField"
    positionIncrementGap="100" autoGeneratePhraseQueries="false">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" catenateWords="0" splitOnCaseChange="0"
        splitOnNumerics="0" stemEnglishPossessive="0"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" catenateWords="0" splitOnCaseChange="0"
        splitOnNumerics="0" stemEnglishPossessive="0"/>
  </analyzer>
</fieldType>
Roy

 




blocking access by user-agent

2011-12-21 Thread RT

Hi,

I would like to control which applications get access to the Solr
database. I am using Jetty as the app container.


Is this at all achievable? If yes, how?

Internet search has not yielded anything I could use so far.

Thanks in advance.

Roland


Re: blocking access by user-agent

2011-12-21 Thread Patrick Plaatje
Hi Roland,

you can configure Jetty to use a simple .htaccess file to allow only
specific IP addresses access to your webapp. Have a look here on how to do
that:

http://www.viaboxxsystems.de/how-to-configure-your-jetty-webapp-to-grant-access-for-dedicated-ip-addresses-only

If you want more sophisticated access control, you need an extra layer
between Solr and the devices accessing your Solr instance.


- Patrick
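
For the simple IP case, the .htaccess in that article boils down to a few
Apache-style directives. A sketch, with illustrative addresses, assuming
Jetty's HTAccessHandler understands the classic allow/deny syntax:

order deny,allow
deny from all
allow from 192.168.1.0/24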


2011/12/21 RT rwatollen...@gmail.com

 Hi,

 I would like to control which applications get access to the Solr database.
 I am using Jetty as the app container.

 Is this at all achievable? If yes, how?

 Internet search has not yielded anything I could use so far.

 Thanks in advance.

 Roland




-- 
Patrick Plaatje
Senior Consultant
http://www.nmobile.nl/


Re: blocking access by user-agent

2011-12-21 Thread RT

Thanks Patrick,

Would you have any idea in what directory to place the .htaccess file to
block all information retrieval?

I am not sure Jetty's HTAccessHandler supports all of the .htaccess
directives. This site, however, implies that it does support the ones for
user-agent blocking:

http://www.irishwebmasterforum.com/coding-help/7692-block-useragent-with-htaccess.html

RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.* - [F,L]


regards,

Roland
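
If HTAccessHandler turns out not to support those rewrite directives, a
generic fallback is a servlet filter declared in Solr's web.xml. A minimal
sketch against the standard Servlet API (the class name and blocked pattern
are hypothetical):

import java.io.IOException;
import java.util.regex.Pattern;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical filter: rejects requests whose User-Agent matches a pattern.
public class UserAgentBlockFilter implements Filter {
  private Pattern blocked;

  public void init(FilterConfig cfg) {
    // pattern is illustrative; read it from an init-param in real use
    blocked = Pattern.compile("^Zeus.*");
  }

  public void doFilter(ServletRequest req, ServletResponse res,
      FilterChain chain) throws IOException, ServletException {
    String ua = ((HttpServletRequest) req).getHeader("User-Agent");
    if (ua != null && blocked.matcher(ua).matches()) {
      ((HttpServletResponse) res).sendError(HttpServletResponse.SC_FORBIDDEN);
      return; // blocked agents never reach Solr
    }
    chain.doFilter(req, res);
  }

  public void destroy() {}
}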





Patrick Plaatje wrote:

Hi Roland,

you can configure Jetty to use a simple .htaccess file to allow only
specific IP addresses access to your webapp. Have a look here on how to
do that:


http://www.viaboxxsystems.de/how-to-configure-your-jetty-webapp-to-grant-access-for-dedicated-ip-addresses-only

If you want more sophisticated access control, you need an extra layer
between Solr and the devices accessing your Solr instance.



- Patrick


2011/12/21 RT rwatollen...@gmail.com mailto:rwatollen...@gmail.com

Hi,

I would like to control which applications get access to the Solr
database. I am using Jetty as the app container.

Is this at all achievable? If yes, how?

Internet search has not yielded anything I could use so far.

Thanks in advance.

Roland




--
Patrick Plaatje
Senior Consultant
http://www.nmobile.nl/


Querying on dynamic field

2011-12-21 Thread Isan Fulia
Hi,

I have a dynamic field E_*.
I want to search for E_abc*:something.
Is there any way I can do this in Solr?

If it is not possible in Solr 3.4, does Solr 4.0 include wildcard queries on
dynamic fields?


-- 
Thanks & Regards,
Isan Fulia.


Re: Mapping and Capture in ExtractingRequestHandler

2011-12-21 Thread Erick Erickson
Googling 'solrj examples' points right to a great example here:

http://wiki.apache.org/solr/Solrj

Best
Erick
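
To sketch what such a program might look like for the use case quoted
below, assuming Jsoup as the forgiving HTML parser and the SolrJ 3.x
client; the field names, id, and URL are hypothetical:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Hypothetical sketch: parse the exported HTML, pull out the interesting
// divs by class name, and build a SolrInputDocument by hand instead of
// relying on the ExtractingRequestHandler.
public class HtmlIndexer {
  public static void main(String[] args) throws Exception {
    SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");

    String html = "<html>...</html>"; // in practice, read the message from disk
    Document page = Jsoup.parse(html); // Jsoup tolerates broken markup

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "msg-1"); // hypothetical unique key
    doc.addField("message_body", page.select("div.message-body").text());
    doc.addField("attachments", page.select("div.attachment-entry").text());

    solr.add(doc);
    solr.commit();
  }
}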

On Wed, Dec 21, 2011 at 12:04 AM, Swapna Vuppala
swapna.vupp...@arup.com wrote:
 Hi Erick,

 Can you please give me a little more information about a SolrJ program and
 how to use it to construct a Solr document?

 Thanks and Regards,
 Swapna.

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Wednesday, December 21, 2011 2:28 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Mapping and Capture in ExtractingRequestHandler

 When you start getting into complex HTML extraction, you're probably
 better off using a SolrJ program with a forgiving HTML parser,
 extracting the relevant bits yourself, and constructing a
 SolrDocument.

 FWIW,
 Erick

 On Tue, Dec 20, 2011 at 12:54 AM, Swapna Vuppala
 swapna.vupp...@arup.com wrote:
 Hi,

 I understand that we can specify parameters in ExtractingRequestHandler in 
 solrconfig.xml to capture HTML tags of a particular type and map them to 
 desired solr fields, like something below.

 <str name="capture">div</str>
 <str name="fmap.div">mysolrfield</str>

 The above setting will capture content in div tags and copy it to the Solr
 field mysolrfield.

 What I am interested in is capturing div tags with a particular class name
 into a Solr field. When extracting content from Outlook messages, I would
 like the content within <div class="message-body"> to go into one Solr
 field and the content within <div class="attachment-entry"> to go into
 another Solr field.

 Can someone please let me know how to achieve this ?

 Thanks and Regards,
 Swapna.

 


Re: disable stemming on query parser.

2011-12-21 Thread Erick Erickson
Actually, 1M records isn't all that much for a Solr index, so I'd
simply test with the
copyfield alternative as it's much easier.

About compression: this simply compresses the *stored* data, which has
essentially no effect on index search speed, but will affect the size of
the file (*.fdt) that contains stored data. Here's a good reference:
http://lucene.apache.org/java/3_0_2/fileformats.html#file-names

The fields you copy *to* should probably not be stored (stored=false).

The idea (I thought there was a Solr patch for a new Filter, but I can't
find it) is to have something similar to the SynonymAnalyzer in Lucene
In Action that inserts a special token at, say, the end of the
term *as well as* sending the original term. Say you're indexing
"running". Your index process would put in "run" and "running#". Now
when searching, when you want an exact match, you search for
the terms with the '#' at the end. Yes, it makes your index larger,
but whatever you do will make the index larger.

Best
Erick
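
To make the idea concrete, here is a rough sketch of such a filter,
assuming Lucene 3.x analysis APIs and a downstream stemmer that honors
KeywordAttribute (as the stemmers bundled with recent 3.x releases do).
The class name is hypothetical; this is not an existing Solr filter:

import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.KeywordAttribute;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.util.AttributeSource;

/** Emits each token twice: once unchanged (for the stemmer), and once with
 *  a '#' marker appended, stacked at the same position and flagged as a
 *  keyword so the downstream stemmer leaves it alone. */
public final class MarkOriginalFilter extends TokenFilter {
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final PositionIncrementAttribute posIncAtt =
      addAttribute(PositionIncrementAttribute.class);
  private final KeywordAttribute keywordAtt = addAttribute(KeywordAttribute.class);
  private AttributeSource.State pending;

  public MarkOriginalFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (pending != null) {
      restoreState(pending);             // replay the original token...
      pending = null;
      termAtt.append('#');               // ...with the exact-match marker
      posIncAtt.setPositionIncrement(0); // stacked at the same position
      keywordAtt.setKeyword(true);       // stemmer skips this copy
      return true;
    }
    if (!input.incrementToken()) {
      return false;
    }
    pending = captureState();            // remember it before stemming
    return true;
  }

  @Override
  public void reset() throws IOException {
    super.reset();
    pending = null;
  }
}

Placed before the stemmer in the index analyzer, "running" comes out as
run plus running#; an exact-match query then targets myfield:running#.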

On Wed, Dec 21, 2011 at 5:36 AM, meghana meghana.rav...@amultek.com wrote:
 Hi Dmitry ,

 If we add some unseen character sequence to the array, doesn't it remove my
 stemming at all times? How can we manage stemmed and unstemmed words in the
 same field? I am a bit confused about this.

 I also tried compression on the field which I use as a copy field. From
 what I read about compression on a field, it should make your index size
 lower and lower performance a bit while querying. But when I tried it
 on my local Solr configuration (which has about 5000 records, and the copy
 field size is more than 5000 chars, or maybe much more), it behaved totally
 opposite: it increased my index file size, and performance did not
 decrease. Have any idea why it behaved like this?

 I'd like to note that I tried this with my local configuration of Solr.
 In live Solr, we have more than 10 lakh (1 million) records, and the copy
 field size is very big (about 5000 chars or much more).

 Thanks in advance,
 Meghana



Solr 3.5 | Highlighting

2011-12-21 Thread Tanguy Moal

Dear all,

I'm trying to get highlighting working, and I'm almost done, but it's not
perfect yet...


Basically my documents have a title and a description.

I have two kinds of text fields.

"text":

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="0" catenateNumbers="0"
        catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

and "text_french_light":

<fieldType name="text_french_light" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="0" catenateNumbers="0"
        catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

I then define my fields the following way:

<field name="title" type="text" indexed="true" stored="true"
    termVectors="true" termPositions="true" termOffsets="true"/>
<field name="title_stemmed" type="text_french_light" indexed="true"
    stored="true" termVectors="true" termPositions="true"
    termOffsets="true"/>
<field name="title_stemmed_nonorms" type="text_french_light"
    indexed="true" stored="false" omitNorms="true"
    omitTermFreqAndPositions="true"/>
<field name="description" type="text" indexed="true" stored="true"
    termVectors="true" termPositions="true" termOffsets="true"/>
<field name="description_stemmed" type="text_french_light"
    indexed="true" stored="true" termVectors="true" termPositions="true"
    termOffsets="true"/>
<field name="description_stemmed_nonorms" type="text_french_light"
    indexed="true" stored="false" omitNorms="true"
    omitTermFreqAndPositions="true"/>

I have the following copyField directives:

<copyField source="title" dest="title_stemmed"/>
<copyField source="title" dest="title_stemmed_nonorms"/>
<copyField source="description" dest="description_stemmed"/>
<copyField source="description" dest="description_stemmed_nonorms"/>

I rely on the dismax query handler to achieve relevancy.

I have two different search use cases:
- a structured search mode, where my query looks like:
  q=Term1 term2&qf=my_category_field^1.0&hl.q=Word1 word2&mm=100%
- a free-text search mode, where my query looks like:
  q=Term1 term2&qf=title_stemmed_nonorms^1.0 description_stemmed_nonorms^0.5&mm=-40%

Shared query parameters are as follows:
defType=dismax&hl=on&hl.fl=title_stemmed description_stemmed&hl.useFastVectorHighlighter=true&hl.fragListBuilder=single


For all use cases, I have good relevancy parameters and my results are
satisfying.

Troubles concern highlighting:
- in the free-text search mode, everything is fine: the query is not
a phrase query, and highlighted terms may vary from the query terms (if
stemming came into play);
- in the structured search mode, I've got less luck: the query is a
phrase query. Therefore, I rely on the hl.q parameter to achieve my
needs. However, when the query is specified in the hl.q parameter, it isn't
processed the same way as when highlighting from the
fields: query analysis seems not to be applied.
I can prove it easily by 

Re: disable stemming on query parser.

2011-12-21 Thread meghana
Hi,

So we need to find that Solr patch which adds a special character to each
word that I index. I'd like to add here that my copy field is a multivalued
field, with many sentences. So would it add the special character to each
word of those?

And about compression, Erick: yes, I am storing my copy field (stored=true),
because we also want highlighting on that field when searched, so we have to
store it. Even so, it does not reduce my index file size (*.fdt); instead,
it increases it. Do you think there is any case in which compression
increases the file size? I really need to get this working.

Thanks 
Meghana





Re: Where clause

2011-12-21 Thread Gora Mohanty
On Wed, Dec 21, 2011 at 4:58 PM, ayyappan ayyaba...@gmail.com wrote:
 In the information search in the DB, the query has to be customer-specific.
 Is it possible to change the "where" clause value in the select query
 based on the user who is logged in?

It is not clear what you are asking here. Which "where" clause, in what
select query?

If you mean, could a Solr query be tailored to a specific customer, yes
that can be done in various ways, e.g., by having the front-end modify
the query as per the known customer ID, or by having a customer ID
field in the Solr index that is used to tailor results to that customer.

Regards,
Gora
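
For instance, the second approach could be implemented with a filter query
that the front end appends to every request (the customer_id field name is
hypothetical, not from the thread):

http://localhost:8983/solr/select?q=some+search+terms&fq=customer_id:12345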


Solr - Multivalue field search on different elements

2011-12-21 Thread meghana
Hi all,

I need to make a particular kind of search on a multivalued field.
For example, I have data as below:

<arr name="xx">
  <str>Michel</str>
  <str>Jackson</str>
  <str>is</str>
  <str>good</str>
  <str>singer and dancer</str>
</arr>

If I search for "Michel Jackson", then I want the record above to
come back in the results (the search words are in consecutive elements).

Does anybody have any idea?
Thanks
Meghana



Re: a question on jmx solr exposure

2011-12-21 Thread Gora Mohanty
On Wed, Dec 21, 2011 at 3:26 PM, Dmitry Kan dmitry@gmail.com wrote:
 Hello list,

 This might not be the right place to ask JMX-specific questions, but I
 decided to try, as we are polling Solr statistics through JMX.

 We currently have two Solr cores with different schemas, A and B, being run
 under the same Tomcat instance. The question is: which stats is jconsole
 going to see under solr/?
[...]

Have not tried this, but each core has a separate solrconfig.xml.
Thus, you could specify a different JMX setup for each, e.g., by
changing the name, or port.

Regards,
Gora
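
For example, each core's solrconfig.xml can expose its MBeans through its
own JMX connector via the <jmx/> element. A sketch; the service URL is
hypothetical:

<jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr_core_a"/>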


Re: disable stemming on query parser.

2011-12-21 Thread Erick Erickson
Why do you think you require compression? It doesn't affect search.
It should lengthen the document load time. You haven't told us how
big your index is yet, so we can't judge a thing about whether you
really need this or not. You haven't told us what your evidence is that
the index size increases. Seeing the index get bigger is probably an
error on your part, but we can't help without knowing more details.

You can look at the Solr JIRAs, but I took a quick scan and nothing
jumped out at me. Maybe I'm remembering things from the user's list,
as this question has been asked multiple times. On a second glance
I don't think there's a patch; you'll probably have to write some custom
code.

Best
Erick

On Wed, Dec 21, 2011 at 9:32 AM, meghana meghana.rav...@amultek.com wrote:
 Hi,

 So we need to find that Solr patch which adds a special character to each
 word that I index. I'd like to add here that my copy field is a multivalued
 field, with many sentences. So would it add the special character to each
 word of those?

 And about compression, Erick: yes, I am storing my copy field (stored=true),
 because we also want highlighting on that field when searched, so we have to
 store it. Even so, it does not reduce my index file size (*.fdt); instead,
 it increases it. Do you think there is any case in which compression
 increases the file size? I really need to get this working.

 Thanks
 Meghana



Re: Solr - Multivalue field search on different elements

2011-12-21 Thread Tanguy Moal

Hello,

I think that the positionIncrementGap attribute of your field has to be
changed to 0 (instead of the default 100).


(See 
http://lucene.472066.n3.nabble.com/positionIncrementGap-in-schema-xml-td488338.html 
)


Hope this helps,

--
Tanguy
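
A minimal sketch of that change (the type name here is hypothetical; apply
it to whatever field type backs the xx field):

<fieldType name="text_gap0" class="solr.TextField" positionIncrementGap="0">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

With a gap of 0, the last token of one value and the first token of the
next sit at adjacent positions, so the phrase query "Michel Jackson" can
match across consecutive elements.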

Le 21/12/2011 15:39, meghana a écrit :

Hi all,

I need to make a particular kind of search on a multivalued field.
For example, I have data as below:

<arr name="xx">
  <str>Michel</str>
  <str>Jackson</str>
  <str>is</str>
  <str>good</str>
  <str>singer and dancer</str>
</arr>

If I search for "Michel Jackson", then I want the record above to
come back in the results (the search words are in consecutive elements).

Does anybody have any idea?
Thanks
Meghana





Re: Exception using SolrJ

2011-12-21 Thread Shawn Heisey

On 12/21/2011 1:10 AM, Shawn Heisey wrote:

On 12/20/2011 10:33 AM, Otis Gospodnetic wrote:

Shawn,

Give httping a try: http://www.vanheusden.com/httping/

It may reveal something about connection being dropped periodically.
Maybe even a plain ping would show some dropped packets if it's a 
general network and not a Solr-specific issue.


The connections here are gigabit Ethernet on the same VLAN, and
sometimes it happens to cores on the same box that's running the SolrJ
code, which, if all things are sane, never actually goes out the NIC.
I see no errors on the interface.


bond0 Link encap:Ethernet  HWaddr 00:1C:23:DC:81:53
  inet addr:10.100.0.240  Bcast:10.100.1.255  Mask:255.255.254.0
  inet6 addr: fe80::21c:23ff:fedc:8153/64 Scope:Link
  UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
  RX packets:453134140 errors:0 dropped:0 overruns:0 frame:0
  TX packets:297893403 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:446857564768 (416.1 GiB)  TX bytes:191134876472 
(178.0 GiB)


BONDING_OPTS=mode=1 miimon=100 updelay=200 downdelay=200 primary=eth0


I realized after sending the ifconfig that errors would probably not 
show on the bonded interface.  Stats are also clear on the slaves:


eth0  Link encap:Ethernet  HWaddr 00:1C:23:DC:81:53
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:454373740 errors:0 dropped:0 overruns:0 frame:0
  TX packets:301194576 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:449062687599 (418.2 GiB)  TX bytes:193031706549 
(179.7 GiB)

  Interrupt:16 Memory:f800-f8012800

eth1  Link encap:Ethernet  HWaddr 00:1C:23:DC:81:53
  UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
  RX packets:2261000 errors:0 dropped:0 overruns:0 frame:0
  TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000
  RX bytes:194331296 (185.3 MiB)  TX bytes:398 (398.0 b)
  Interrupt:16 Memory:f400-f4012800

The switch interfaces are also very clean, as seen below.  They do show 
some output drops, but the percentage of packets is extremely low.


GigabitEthernet0/13 is up, line protocol is up (connected)
  Hardware is Gigabit Ethernet, address is 0024.c3cc.ad0d (bia 
0024.c3cc.ad0d)

  Description: bigindy0 nic1
  MTU 1500 bytes, BW 100 Kbit, DLY 10 usec,
 reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX
  input flow-control is on, output flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 1y45w, output 00:00:01, output hang never
  Last clearing of show interface counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 74219
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 378000 bits/sec, 81 packets/sec
  5 minute output rate 1863000 bits/sec, 210 packets/sec
 15993961043 packets input, 18181095872276 bytes, 0 no buffer
 Received 31769202 broadcasts (20225268 multicasts)
 0 runts, 0 giants, 0 throttles
 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
 0 watchdog, 20225268 multicast, 0 pause input
 0 input packets with dribble condition detected
 21413035341 packets output, 21796346722157 bytes, 0 underruns
 0 output errors, 0 collisions, 3 interface resets
 0 babbles, 0 late collision, 0 deferred
 0 lost carrier, 0 no carrier, 0 PAUSE output
 0 output buffer failures, 0 output buffers swapped out

switch uptime 2 years, 27 weeks, 4 days, 21 hours, 20 minutes
host uptime 33 days, 16:21

Even if there were the occasional packet being dropped by the switch, 
the TCP stack in Linux should immediately retry that packet and 
everything would be fine, though delayed slightly.  The number of output 
drops here is 0.00035 percent of the total packets output.  One of the 
other machines (in a different switch) shows ten times as many 
switchport drops, but even that is 0.0037 percent of the packets on that 
port.  I have cleared the counters on all the switches, and after 
twenty minutes and 40 packets output, it's running completely 
clean.  I will keep an eye on those stats and wait for the next 
exception to see if there is a spike in output drops when the problem 
happens.  I don't expect that to be the problem, though.  If it is a 
networking problem, it is most likely to be in the CentOS 6 kernel.  I'd 
like for it to be that simple, but I think the possibility there is small.


I think it's more likely that it's a software problem, and that the 
error was probably mine, but I need help in tracking it down.


Thanks,
Shawn



RE: Spellchecker issue related to exact match of query in spellcheck index

2011-12-21 Thread Pravin Agrawal
Hi James,

Thanks a lot for your reply. The workaround that you suggested is working
fine for me. Hope to see this enhancement in a future release of Solr.

-Pravin

From: Dyer, James [james.d...@ingrambook.com]
Sent: Monday, December 19, 2011 11:11 PM
To: solr-user@lucene.apache.org
Subject: RE: Spellchecker issue related to exact match of query in spellcheck 
index

Pravin,

When using the file-based spell checking option, it will try to give you 
suggestions for every query term, regardless of whether or not they are in 
your spelling dictionary. Getting the behavior you want would seem to be a 
worthy enhancement, but I don't think it is currently supported. You might 
be able to work around this if you could get your dictionary terms into the 
index and then use the index-based option instead.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
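
For reference, a minimal sketch of what the index-based option James
mentions could look like in solrconfig.xml (the field and directory names
are hypothetical):

<lst name="spellchecker">
  <str name="classname">solr.IndexBasedSpellChecker</str>
  <str name="name">default</str>
  <str name="field">spell</str>
  <str name="spellcheckIndexDir">./spellcheckerIndex</str>
  <str name="buildOnCommit">true</str>
</lst>

Since this checker is built from an indexed field, a term that already
exists in that field is treated as correctly spelled.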

-Original Message-
From: Pravin Agrawal [mailto:pravin_agra...@persistent.co.in]
Sent: Saturday, December 17, 2011 4:51 AM
To: solr-user@lucene.apache.org
Cc: Tushar Adeshara
Subject: Spellchecker issue related to exact match of query in spellcheck index

Hi All,

I am trying to use the file-based spellchecker in Solr 3.4 and am facing 
the issue below.

My dictionary file contains the following terms:
<snip>
abcd
abcde
abcdef
abcdefg
</snip>

However, when checking the spelling of "abcd", it gives the suggestion 
"abcde" even though the word "abcd" is present in the dictionary file. 
Here is sample output:

http://10.88.36.192:8080/solr/spell?spellcheck.build=true&spellcheck=true&spellcheck.collate=true&q=abcd

<result name="response" numFound="0" start="0"/>
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="abcd">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">4</int>
      <arr name="suggestion">
        <str>abcde</str>
      </arr>
    </lst>
    <str name="collation">abcde</str>
  </lst>
</lst>


I am expecting the spell checker to give no suggestion if the word is 
already present in the dictionary; however, that's not the case, as shown 
above. I am using the configuration given below. Please let me know if I am 
missing something or if this is expected behavior. Also, please let me know 
what I should do to get my desired output (i.e., no suggestion if the word 
is already in the dictionary).

Thanks in advance.


Configuration:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">spellcheck_text</str>
  <lst name="spellchecker">
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="name">default</str>
    <str name="comparatorClass">score</str>
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

schema.xml has the following field type:

<fieldType name="spellcheck_text" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


Thanks
Pravin





Replication not working

2011-12-21 Thread Dean Pullen
Hi all,

I have an odd problem locally when attempting replication with Solr 1.4.

The problem is, though the master files get copied to a temp directory in the 
slave data directory (I see this happen at runtime), they are then not copied 
over the actual slave index data.

We were wondering if it was due to the index version of the restored master 
data being behind the slave index version after a restore? Any other ideas 
would be appreciated.

Thanks,

Dean Pullen

Re: Replication not working

2011-12-21 Thread Dean Pullen
E.g. I see this in the slave logs:

2011-12-21 15:45:27,635  INFO handler.SnapPuller:265 - Master's version: 
1271406570655, generation: 376
2011-12-21 15:45:27,635  INFO handler.SnapPuller:266 - Slave's version: 
1271406571565, generation: 1286
2011-12-21 15:45:27,636  INFO handler.SnapPuller:267 - Starting replication 
process
2011-12-21 15:45:27,639  INFO handler.SnapPuller:270 - Number of files in 
latest index in master: 9
…
2011-12-21 15:45:50,997  INFO handler.SnapPuller:286 - Total time taken for 
download : 23 secs
2011-12-21 15:45:51,050  INFO handler.SnapPuller:586 - New index installed. 
Updating index properties…

Yet the index doesn't change!


On 21 Dec 2011, at 15:37, Dean Pullen wrote:

 Hi all,
 
 I have an odd problem locally when attempting replication with solr 1.4
 
 The problem is, though the master files get copied to a temp directory in the 
 slave data directory (I see this happen at runtime), they are then not copied 
 over the actual slave index data.
 
 We were wondering if it was due to the index version of the restored master 
 data being behind the slave index version after a restore? Any other ideas 
 would be appreciated.
 
 Thanks,
 
 Dean Pullen



Re: URLDataSource delta import

2011-12-21 Thread Alessandro Benedetti
Any News?
I'm also interested in this topic :)

2011/12/12 Brian Lamb brian.l...@journalexperts.com

 Hi all,

 According to
 http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource
 a delta-import is not currently implemented for URLDataSource. I say
 "currently" because I've noticed that such documentation is out of date in
 many places. I wanted to see if this feature has been added yet, or if
 there are plans to do so.

 Thanks,

 Brian Lamb




-- 
--

Benedetti Alessandro
Personal Page: http://tigerbolt.altervista.org

Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?

William Blake - Songs of Experience -1794 England


Re: Querying on dynamic field

2011-12-21 Thread Tomás Fernández Löbbe
The only way I know of is to use a copyField at index time that copies
everything from fields called E_* to a field with a known name, then use
that field for searching.
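
A minimal schema.xml sketch of that arrangement (the catch-all field name
E_all is hypothetical):

<dynamicField name="E_*" type="text" indexed="true" stored="false"/>
<field name="E_all" type="text" indexed="true" stored="false"
    multiValued="true"/>
<copyField source="E_*" dest="E_all"/>

A query such as E_all:something then searches across every E_* field at
once.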

On Wed, Dec 21, 2011 at 9:41 AM, Isan Fulia isan.fu...@germinait.com wrote:

 Hi,

 I have a dynamic field E_*.
 I want to search for E_abc*:something.
 Is there any way I can do this in Solr?

 If it is not possible in Solr 3.4, does Solr 4.0 include wildcard queries
 on dynamic fields?


 --
 Thanks & Regards,
 Isan Fulia.



Re: Replication not working

2011-12-21 Thread Erick Erickson
You've probably hit it on the head. The slave version is greater than the master
version, so replication isn't necessary. BTW, the version starts
life as a timestamp,
but then is simply incremented on successive commits, which accounts for
what you are seeing.

You should be able to blow the index away on the slave and wait for replication
and go from there.

Another possibility: How much faith do you have in your slave index?
If it's all good,
you could simply copy *that* to the master manually and go from there.

If you're rebuilding your entire index, just blow the master index
away, re-index from
scratch and that should work too (be sure to disable replication
during the rebuild
unless you want a partial index on the slave).

Although copying the files *then* deciding not to use them doesn't seem like
a good thing. Not sure if 3.x has the same behavior or not...

Best
Erick

On Wed, Dec 21, 2011 at 10:46 AM, Dean Pullen dean.pul...@semantico.com wrote:
 E.g. I see this in the slave logs:

 2011-12-21 15:45:27,635  INFO handler.SnapPuller:265 - Master's version: 
 1271406570655, generation: 376
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:266 - Slave's version: 
 1271406571565, generation: 1286
 2011-12-21 15:45:27,636  INFO handler.SnapPuller:267 - Starting replication 
 process
 2011-12-21 15:45:27,639  INFO handler.SnapPuller:270 - Number of files in 
 latest index in master: 9
 …
 2011-12-21 15:45:50,997  INFO handler.SnapPuller:286 - Total time taken for 
 download : 23 secs
 2011-12-21 15:45:51,050  INFO handler.SnapPuller:586 - New index installed. 
 Updating index properties…

 Yet the index doesn't change!


 On 21 Dec 2011, at 15:37, Dean Pullen wrote:

 Hi all,

 I have an odd problem locally when attempting replication with solr 1.4

 The problem is, though the master files get copied to a temp directory in 
 the slave data directory (I see this happen at runtime), they are then not 
 copied over the actual slave index data.

 We were wondering if it was due to the index version of the restored master 
 data being behind the slave index version after a restore? Any other ideas 
 would be appreciated.

 Thanks,

 Dean Pullen



Re: Solr Tomcat Maximum Heap Memory

2011-12-21 Thread Andre Bois-Crettez

Try running a 64-bit JVM on your 64-bit OS; it should work for much
larger heap sizes, be it Linux or Windows.

Beware that the memory need is around 30% higher with a 64-bit
JVM (bigger object pointers) if you are not using compressed oops:
http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#compressedOop

André
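
For example, once a 64-bit JDK is installed, Tomcat on Windows could be
given a larger heap via JAVA_OPTS (a sketch; adjust the size to your
machine):

set JAVA_OPTS=-Xmx8g -XX:+UseCompressedOops

On Linux the equivalent would be:

export JAVA_OPTS="-Xmx8g -XX:+UseCompressedOops"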


Husain, Yavar wrote:

I know this is a Solr forum, but my problem is related to Solr running on 
Tomcat on a 64-bit Windows OS.

I am running a 32-bit JVM on a 64-bit Windows 2008 Server. The max heap 
space I am able to allocate is around 1.5 GB, though I have 10 GB of RAM on 
my system and there is no other process running.
I understand the limit of max 2 GB of heap space that can be allocated to a 
process on Windows. However, I have seen people in the forums state they 
use -Xmx up to 10G. How is this possible? If I move to Linux, can I get 
more heap space allocated to the process, or is it related to the JVM?

Simply put, how can I allocate at least 8 GB of RAM as -Xmx to Tomcat on my 
64-bit Windows machine? Tomcat crashes when I start it that way. Please help.


--
André Bois-Crettez

Search technology, Kelkoo
http://www.kelkoo.com/




Re: Replication not working

2011-12-21 Thread Dean Pullen
Thought as much, thanks for the reply.

Is there an easy way of dropping the index on the slave, or do I have to 
manually delete the index files?

Regards,

Dean.



On 21 Dec 2011, at 15:54, Erick Erickson wrote:

 You've probably hit it on the head. The slave version is greater than the 
 master
 version, so replication isn't necessary. BTW, the version starts
 life as a timestamp,
 but then is simply incremented on successive commits, which accounts for
 what you are seeing.
 
 You should be able to blow the index away on the slave and wait for 
 replication
 and go from there.
 
 Another possibility: How much faith do you have in your slave index?
 If it's all good,
 you could simply copy *that* to the master manually and go from there.
 
 If you're rebuilding your entire index, just blow the master index
 away, re-index from
 scratch and that should work too (be sure to disable replication
 during the rebuild
 unless you want a partial index on the slave).
 
 Although copying the files *then* deciding not to use them doesn't seem like
 a good thing. Not sure if 3.x has the same behavior or not...
 
 Best
 Erick
 
 On Wed, Dec 21, 2011 at 10:46 AM, Dean Pullen dean.pul...@semantico.com 
 wrote:
 E.g. I see this in the slave logs:
 
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:265 - Master's version: 
 1271406570655, generation: 376
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:266 - Slave's version: 
 1271406571565, generation: 1286
 2011-12-21 15:45:27,636  INFO handler.SnapPuller:267 - Starting replication 
 process
 2011-12-21 15:45:27,639  INFO handler.SnapPuller:270 - Number of files in 
 latest index in master: 9
 …
 2011-12-21 15:45:50,997  INFO handler.SnapPuller:286 - Total time taken for 
 download : 23 secs
 2011-12-21 15:45:51,050  INFO handler.SnapPuller:586 - New index installed. 
 Updating index properties…
 
 Yet the index doesn't change!
 
 
 On 21 Dec 2011, at 15:37, Dean Pullen wrote:
 
 Hi all,
 
 I have an odd problem locally when attempting replication with solr 1.4
 
 The problem is, though the master files get copied to a temp directory in 
 the slave data directory (I see this happen at runtime), they are then not 
 copied over the actual slave index data.
 
 We were wondering if it was due to the index version of the restored master 
 data being behind the slave index version after a restore? Any other ideas 
 would be appreciated.
 
 Thanks,
 
 Dean Pullen
 



Re: Replication not working

2011-12-21 Thread Dean Pullen
I can't see a way, if the slave is on another server.

We're going to upgrade solr - as you can delete the index after unloading a 
core in this way:

cores?action=UNLOAD&core=liveCore&deleteIndex=true

From v3.3 (I think)

On 21 Dec 2011, at 16:11, Dean Pullen wrote:

 Thought as much, thanks for the reply.
 
 Is there an easy way of dropping the index on the slave, or do I have to 
 manually delta the index files?
 
 Regards,
 
 Dean.
 
 
 
 On 21 Dec 2011, at 15:54, Erick Erickson wrote:
 
 You've probably hit it on the head. The slave version is greater than the 
 master
 version, so replication isn't necessary. BTW, the version starts
 life as a timestamp,
 but then is simply incremented on successive commits, which accounts for
 what you are seeing.
 
 You should be able to blow the index away on the slave and wait for 
 replication
 and go from there.
 
 Another possibility: How much faith do you have in your slave index?
 If it's all good,
 you could simply copy *that* to the master manually and go from there.
 
 If you're rebuilding your entire index, just blow the master index
 away, re-index from
 scratch and that should work too (be sure to disable replication
 during the rebuild
 unless you want a partial index on the slave).
 
 Although copying the files *then* deciding not to use them doesn't seem like
 a good thing. Not sure if 3.x has the same behavior or not...
 
 Best
 Erick
 
 On Wed, Dec 21, 2011 at 10:46 AM, Dean Pullen dean.pul...@semantico.com 
 wrote:
 E.g. I see this in the slave logs:
 
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:265 - Master's version: 
 1271406570655, generation: 376
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:266 - Slave's version: 
 1271406571565, generation: 1286
 2011-12-21 15:45:27,636  INFO handler.SnapPuller:267 - Starting replication 
 process
 2011-12-21 15:45:27,639  INFO handler.SnapPuller:270 - Number of files in 
 latest index in master: 9
 …
 2011-12-21 15:45:50,997  INFO handler.SnapPuller:286 - Total time taken for 
 download : 23 secs
 2011-12-21 15:45:51,050  INFO handler.SnapPuller:586 - New index installed. 
 Updating index properties…
 
 Yet the index doesn't change!
 
 
 On 21 Dec 2011, at 15:37, Dean Pullen wrote:
 
 Hi all,
 
 I have an odd problem locally when attempting replication with solr 1.4
 
 The problem is, though the master files get copied to a temp directory in 
 the slave data directory (I see this happen at runtime), they are then not 
 copied over the actual slave index data.
 
 We were wondering if it was due to the index version of the restored 
 master data being behind the slave index version after a restore? Any 
 other ideas would be appreciated.
 
 Thanks,
 
 Dean Pullen
 
 



Re: Exception using SolrJ

2011-12-21 Thread Chantal Ackermann
Hi Shawn,

maybe the requests that fail have a certain pattern - for example that
they are longer than all the others.

Chantal



Re: Replication not working

2011-12-21 Thread Erick Erickson
Be careful deleting the index manually: delete the entire index directory,
so that the data dir has no index directory under it.

As for copying the index from the slave to the master: just shut down
the master, delete all the files from its index, and use scp or something
similar to copy the index files from the slave to the master.

Best
Erick
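
A sketch of that sequence on the slave, assuming the default data
directory layout (the path is illustrative):

# stop the slave, remove only the index directory, then restart;
# the next replication poll pulls a complete copy from the master
rm -rf /path/to/solr/data/index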

On Wed, Dec 21, 2011 at 11:37 AM, Dean Pullen dean.pul...@semantico.com wrote:
 I can't see a way, if the slave is on another server.

 We're going to upgrade solr - as you can delete the index after unloading a 
 core in this way:

 cores?action=UNLOAD&core=liveCore&deleteIndex=true

 From v3.3 (I think)

 On 21 Dec 2011, at 16:11, Dean Pullen wrote:

 Thought as much, thanks for the reply.

 Is there an easy way of dropping the index on the slave, or do I have to 
 manually delete the index files?

 Regards,

 Dean.



 On 21 Dec 2011, at 15:54, Erick Erickson wrote:

 You've probably hit it on the head. The slave version is greater than the 
 master
 version, so replication isn't necessary. BTW, the version starts
 life as a timestamp,
 but then is simply incremented on successive commits, which accounts for
 what you are seeing.

 You should be able to blow the index away on the slave and wait for 
 replication
 and go from there.

 Another possibility: How much faith do you have in your slave index?
 If it's all good,
 you could simply copy *that* to the master manually and go from there.

 If you're rebuilding your entire index, just blow the master index
 away, re-index from
 scratch and that should work too (be sure to disable replication
 during the rebuild
 unless you want a partial index on the slave).

 Although copying the files *then* deciding not to use them doesn't seem like
 a good thing. Not sure if 3.x has the same behavior or not...

 Best
 Erick

 On Wed, Dec 21, 2011 at 10:46 AM, Dean Pullen dean.pul...@semantico.com 
 wrote:
 E.g. I see this in the slave logs:

 2011-12-21 15:45:27,635  INFO handler.SnapPuller:265 - Master's version: 
 1271406570655, generation: 376
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:266 - Slave's version: 
 1271406571565, generation: 1286
 2011-12-21 15:45:27,636  INFO handler.SnapPuller:267 - Starting 
 replication process
 2011-12-21 15:45:27,639  INFO handler.SnapPuller:270 - Number of files in 
 latest index in master: 9
 …
 2011-12-21 15:45:50,997  INFO handler.SnapPuller:286 - Total time taken 
 for download : 23 secs
 2011-12-21 15:45:51,050  INFO handler.SnapPuller:586 - New index 
 installed. Updating index properties…

 Yet the index doesn't change!


 On 21 Dec 2011, at 15:37, Dean Pullen wrote:

 Hi all,

 I have an odd problem locally when attempting replication with solr 1.4

 The problem is, though the master files get copied to a temp directory in 
 the slave data directory (I see this happen at runtime), they are then 
 not copied over the actual slave index data.

 We were wondering if it was due to the index version of the restored 
 master data being behind the slave index version after a restore? Any 
 other ideas would be appreciated.

 Thanks,

 Dean Pullen





Re: Replication not working

2011-12-21 Thread Dean Pullen
I can't understand, then, how we could ever restore and get replication to work 
without manual intervention!

Dean

On 21 Dec 2011, at 16:37, Dean Pullen wrote:

 I can't see a way, if the slave is on another server.
 
 We're going to upgrade solr - as you can delete the index after unloading a 
 core in this way:
 
 cores?action=UNLOAD&core=liveCore&deleteIndex=true
 
 From v3.3 (I think)
 
 On 21 Dec 2011, at 16:11, Dean Pullen wrote:
 
 Thought as much, thanks for the reply.
 
 Is there an easy way of dropping the index on the slave, or do I have to 
 manually delete the index files?
 
 Regards,
 
 Dean.
 
 
 
 On 21 Dec 2011, at 15:54, Erick Erickson wrote:
 
 You've probably hit it on the head. The slave version is greater than the 
 master
 version, so replication isn't necessary. BTW, the version starts
 life as a timestamp,
 but then is simply incremented on successive commits, which accounts for
 what you are seeing.
 
 You should be able to blow the index away on the slave and wait for 
 replication
 and go from there.
 
 Another possibility: How much faith do you have in your slave index?
 If it's all good,
 you could simply copy *that* to the master manually and go from there.
 
 If you're rebuilding your entire index, just blow the master index
 away, re-index from
 scratch and that should work too (be sure to disable replication
 during the rebuild
 unless you want a partial index on the slave).
 
 Although copying the files *then* deciding not to use them doesn't seem like
 a good thing. Not sure if 3.x has the same behavior or not...
 
 Best
 Erick
 
 On Wed, Dec 21, 2011 at 10:46 AM, Dean Pullen dean.pul...@semantico.com 
 wrote:
 E.g. I see this in the slave logs:
 
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:265 - Master's version: 
 1271406570655, generation: 376
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:266 - Slave's version: 
 1271406571565, generation: 1286
 2011-12-21 15:45:27,636  INFO handler.SnapPuller:267 - Starting 
 replication process
 2011-12-21 15:45:27,639  INFO handler.SnapPuller:270 - Number of files in 
 latest index in master: 9
 …
 2011-12-21 15:45:50,997  INFO handler.SnapPuller:286 - Total time taken 
 for download : 23 secs
 2011-12-21 15:45:51,050  INFO handler.SnapPuller:586 - New index 
 installed. Updating index properties…
 
 Yet the index doesn't change!
 
 
 On 21 Dec 2011, at 15:37, Dean Pullen wrote:
 
 Hi all,
 
 I have an odd problem locally when attempting replication with solr 1.4
 
 The problem is, though the master files get copied to a temp directory in 
 the slave data directory (I see this happen at runtime), they are then 
 not copied over the actual slave index data.
 
 We were wondering if it was due to the index version of the restored 
 master data being behind the slave index version after a restore? Any 
 other ideas would be appreciated.
 
 Thanks,
 
 Dean Pullen
 
 
 



Re: Why is query result set caching disabled for grouping queries?

2011-12-21 Thread astubbs
Been looking at the code a bit, and it seems it's not disabled per se; it's
just not there. The normal searcher has a built-in result set cache check
before and after executing, whereas Grouping#execute doesn't have any
concept of a result cache.

Can't they just share the same cache and implementation?

Has this simply not been done yet?

There also seems to be a lot of prep and post work around the
Grouping#execute call from QueryComponent that doesn't exist around
searcher#search. Should this code perhaps be refactored into an
intermediate step?

I'm looking at:
QueryComponent#process
Grouping#execute
SolrIndexSearcher#search and getDocList

Cheers for the info.
Antony.



Re: Replication not working

2011-12-21 Thread Erick Erickson
You can't. But index restoration should be a very rare thing,
 or you have some lurking problem in your process.

Or this is an XY problem, what problem are you trying to
solve? see: http://people.apache.org/~hossman/#xyproblem

Best
Erick

On Wed, Dec 21, 2011 at 12:21 PM, Dean Pullen dean.pul...@semantico.com wrote:
 I can't understand, then, how we could ever restore and get replication to 
 work without manual intervention!

 Dean

 On 21 Dec 2011, at 16:37, Dean Pullen wrote:

 I can't see a way, if the slave is on another server.

 We're going to upgrade solr - as you can delete the index after unloading a 
 core in this way:

 cores?action=UNLOAD&core=liveCore&deleteIndex=true

 From v3.3 (I think)

 On 21 Dec 2011, at 16:11, Dean Pullen wrote:

 Thought as much, thanks for the reply.

 Is there an easy way of dropping the index on the slave, or do I have to 
 manually delete the index files?

 Regards,

 Dean.



 On 21 Dec 2011, at 15:54, Erick Erickson wrote:

 You've probably hit it on the head. The slave version is greater than the 
 master
 version, so replication isn't necessary. BTW, the version starts
 life as a timestamp,
 but then is simply incremented on successive commits, which accounts for
 what you are seeing.

 You should be able to blow the index away on the slave and wait for 
 replication
 and go from there.

 Another possibility: How much faith do you have in your slave index?
 If it's all good,
 you could simply copy *that* to the master manually and go from there.

 If you're rebuilding your entire index, just blow the master index
 away, re-index from
 scratch and that should work too (be sure to disable replication
 during the rebuild
 unless you want a partial index on the slave).

 Although copying the files *then* deciding not to use them doesn't seem 
 like
 a good thing. Not sure if 3.x has the same behavior or not...

 Best
 Erick

 On Wed, Dec 21, 2011 at 10:46 AM, Dean Pullen dean.pul...@semantico.com 
 wrote:
 E.g. I see this in the slave logs:

 2011-12-21 15:45:27,635  INFO handler.SnapPuller:265 - Master's version: 
 1271406570655, generation: 376
 2011-12-21 15:45:27,635  INFO handler.SnapPuller:266 - Slave's version: 
 1271406571565, generation: 1286
 2011-12-21 15:45:27,636  INFO handler.SnapPuller:267 - Starting 
 replication process
 2011-12-21 15:45:27,639  INFO handler.SnapPuller:270 - Number of files in 
 latest index in master: 9
 …
 2011-12-21 15:45:50,997  INFO handler.SnapPuller:286 - Total time taken 
 for download : 23 secs
 2011-12-21 15:45:51,050  INFO handler.SnapPuller:586 - New index 
 installed. Updating index properties…

 Yet the index doesn't change!


 On 21 Dec 2011, at 15:37, Dean Pullen wrote:

 Hi all,

 I have an odd problem locally when attempting replication with solr 1.4

 The problem is, though the master files get copied to a temp directory 
 in the slave data directory (I see this happen at runtime), they are 
 then not copied over the actual slave index data.

 We were wondering if it was due to the index version of the restored 
 master data being behind the slave index version after a restore? Any 
 other ideas would be appreciated.

 Thanks,

 Dean Pullen






Re: Release build or code for SolrCloud

2011-12-21 Thread Dipti Srivastava
Hi Mark,
I built the example and dist and ran the solrcloud.sh script. While
running I get the following error... Is this OK? It appears that some of
the instances got started, though.

--CLOUD--[ec2-user@ cloud_dev]$ ./solrcloud.sh
./solrcloud.sh: line 16: ant: command not found
Exception in thread main java.lang.NoClassDefFoundError:
org/apache/solr/cloud/ZkController
Caused by: java.lang.ClassNotFoundException:
org.apache.solr.cloud.ZkController
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.solr.cloud.ZkController.
Program will exit.
--CLOUD--[ec2-user@ cloud_dev]$ ls
solrcloud.sh  stop.sh
--CLOUD--[ec2-user@ cloud_dev]$ cd ..
--CLOUD--[ec2-user@ solrcloud]$ ls
cloud_dev  example  example2  example3  example4  example5  example6
--CLOUD--[ec2-user@ solrcloud]$ ps -ef | grep solr
ec2-user 22690 22452  0 18:12 pts/000:00:00 grep solr
--CLOUD--[ec2-user@ solrcloud]$ ps -ef | grep jetty
ec2-user 22521 1  4 18:11 pts/000:00:02 java -Djetty.port=7574
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6574 -DSTOP.KEY=key -jar
start.jar
ec2-user 22522 1  4 18:11 pts/000:00:02 java -Djetty.port=7575
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6575 -DSTOP.KEY=key -jar
start.jar
ec2-user 22523 1  4 18:11 pts/000:00:02 java -Djetty.port=7576
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6576 -DSTOP.KEY=key -jar
start.jar
ec2-user 22524 1  4 18:11 pts/000:00:02 java -Djetty.port=7577
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6577 -DSTOP.KEY=key -jar
start.jar
ec2-user 22525 1  4 18:11 pts/000:00:02 java -Djetty.port=7578
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6578 -DSTOP.KEY=key -jar
start.jar
ec2-user 22692 22452  0 18:12 pts/000:00:00 grep jetty

Thanks!

Dipti

On 12/20/11 5:32 PM, Mark Miller markrmil...@gmail.com wrote:

You might find the solr/cloud-dev/solrcloud.sh script informative. From a
solrcloud branch checkout, you can run it and it will start up a 2 shard,
6 node cluster with zookeeper running on a single node. stop.sh will
shutdown the 6 nodes. Once you start up the nodes, you can start indexing
and searching on any of them, or use the CloudSolrServer solrj client. It
simply takes the ZooKeeper address and figures out the servers from there
(you do currently still have to pass distrib=true to make requests hit
the whole collection).

There will be more help on getting started produced soon. Still some work
to finish up first.

- Mark


On Dec 20, 2011, at 7:17 PM, Dipti Srivastava wrote:

 Thanks for all responses. I got the code from the trunk. Now I will work
 through rest of the steps.
 Dipti

 On 12/20/11 1:58 PM, Chris Hostetter hossman_luc...@fucit.org wrote:


 :  I am following the 2 shard example from the wiki page
 :  http://wiki.apache.org/solr/SolrCloud#SolrCloud-1

 Everything on that wiki should apply to trunk, as noted on the wiki page
 itself.

 the solrcloud branch people have mentioned is related to this comment
 from that wiki page...

 ...
 That is what has been done so far on trunk.

 A second initiative has recently begun to finish the distributed
 indexing side of SolrCloud. See
 https://issues.apache.org/jira/browse/SOLR-2358

 ...and the sub issues linked to from there.

 Any time you are looking for a branch, you can find it in the list of
 branches...

 https://svn.apache.org/repos/asf/lucene/dev/branches/
 https://svn.apache.org/repos/asf/lucene/dev/branches/solrcloud/


 -Hoss






- Mark Miller
lucidimagination.com





best practice to introducing singletons inside of Solr (IoC)

2011-12-21 Thread Mikhail Khludnev
Hello,

I need to introduce several singletons inside of Solr and make them
available to my own SearchHandlers, Components, and even QParsers, etc.

Right now I use a kind of fake SolrRequestHandler which loads on init()
and is available everywhere through
solrCore.getRequestHandler(wellKnownName). Then I downcast it everywhere
and access the required methods. The same is possible with a fake
SearchComponent.
In particular, my singletons are an additional fields schema (pretty
sophisticated) and a kind of request/response encoding facility.
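
For illustration, the pattern I have now looks roughly like this (all the
names here are made up):

import org.apache.solr.common.util.NamedList;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;

// Stand-in for the real shared object.
class MyFieldsSchema {
    MyFieldsSchema(NamedList args) { /* parse whatever config it needs */ }
}

/** Registered in solrconfig.xml under a well-known name, e.g.
 *  <requestHandler name="/singletons" class="com.example.SingletonHolderHandler"/> */
public class SingletonHolderHandler extends RequestHandlerBase {
    private volatile MyFieldsSchema fieldsSchema;

    @Override
    public void init(NamedList args) {
        super.init(args);
        fieldsSchema = new MyFieldsSchema(args);  // built once when the core loads
    }

    public MyFieldsSchema getFieldsSchema() { return fieldsSchema; }

    @Override
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) {
        // never meant to serve real requests; it only carries the singletons
    }

    @Override public String getDescription() { return "singleton holder"; }
    @Override public String getSource() { return ""; }
    @Override public String getSourceId() { return ""; }
    @Override public String getVersion() { return ""; }
}

Every consumer then does the downcast:

    SingletonHolderHandler holder =
            (SingletonHolderHandler) req.getCore().getRequestHandler("/singletons");
    MyFieldsSchema schema = holder.getFieldsSchema();
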
The typical Java hammer for such pins is Spring, but I've found it puzzling to
use
http://static.springframework.org/spring/docs/3.0.x/javadoc-api/org/springframework/web/context/support/WebApplicationContextUtils.html

What's the best way to do that?

-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: Release build or code for SolrCloud

2011-12-21 Thread Dipti Srivastava
Ok, so the issue was that I had only copied the cloud_dev, example and
dist directories, and that's why some of the libraries were missing. I
copied build and lib as well and got around the issue. Now, I am getting
this error when I run the script to start the 6-node cluster.

INFO: makePath: /configs/conf1/velocity/jquery.autocomplete.js
Dec 21, 2011 7:24:49 PM org.apache.solr.common.cloud.SolrZkClient makePath
INFO: makePath: /configs/conf1/velocity/query.vm
Dec 21, 2011 7:24:49 PM org.apache.solr.common.cloud.SolrZkClient makePath
INFO: makePath: /configs/conf1/velocity/hit.vm
Dec 21, 2011 7:24:49 PM org.apache.zookeeper.server.ZooKeeperServerMain
runFromConfig
WARNING: Server interrupted
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1186)
at java.lang.Thread.join(Thread.java:1239)
at
org.apache.zookeeper.server.NIOServerCnxnFactory.join(NIOServerCnxnFactory.
java:318)
at
org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServ
erMain.java:113)
at org.apache.solr.cloud.SolrZkServer$1.run(SolrZkServer.java:116)
--CLOUD--[ec2-user@ cloud-dev]$ ps -ef | grep zk
ec2-user 23796     1 21 19:24 pts/0    00:00:05 java -DzkRun -DnumShards=2
-DSTOP.PORT=7983 -DSTOP.KEY=key -jar start.jar
ec2-user 23797     1 18 19:24 pts/0    00:00:04 java -Djetty.port=7574
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6574 -DSTOP.KEY=key -jar
start.jar
ec2-user 23798     1 19 19:24 pts/0    00:00:04 java -Djetty.port=7575
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6575 -DSTOP.KEY=key -jar
start.jar
ec2-user 23799     1 18 19:24 pts/0    00:00:04 java -Djetty.port=7576
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6576 -DSTOP.KEY=key -jar
start.jar
ec2-user 23800     1 19 19:24 pts/0    00:00:04 java -Djetty.port=7577
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6577 -DSTOP.KEY=key -jar
start.jar
ec2-user 23801     1 19 19:24 pts/0    00:00:04 java -Djetty.port=7578
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6578 -DSTOP.KEY=key -jar
start.jar
ec2-user 23998 22962  0 19:25 pts/0    00:00:00 grep zk
--CLOUD--[ec2-user@ cloud-dev]$



Thanks!
Dipti

On 12/21/11 10:18 AM, Dipti Srivastava dipti.srivast...@apollogrp.edu
wrote:

Hi Mark,
I built the example and dist and ran the solrcloud.sh script. While
running I get the following error... Is this ok? It appears that some of
the instances got started though.

--CLOUD--[ec2-user@ cloud_dev]$ ./solrcloud.sh
./solrcloud.sh: line 16: ant: command not found
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/solr/cloud/ZkController
Caused by: java.lang.ClassNotFoundException:
org.apache.solr.cloud.ZkController
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: org.apache.solr.cloud.ZkController.
Program will exit.
--CLOUD--[ec2-user@ cloud_dev]$ ls
solrcloud.sh  stop.sh
--CLOUD--[ec2-user@ cloud_dev]$ cd ..
--CLOUD--[ec2-user@ solrcloud]$ ls
cloud_dev  example  example2  example3  example4  example5  example6
--CLOUD--[ec2-user@ solrcloud]$ ps -ef | grep solr
ec2-user 22690 22452  0 18:12 pts/0    00:00:00 grep solr
--CLOUD--[ec2-user@ solrcloud]$ ps -ef | grep jetty
ec2-user 22521     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7574
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6574 -DSTOP.KEY=key -jar
start.jar
ec2-user 22522     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7575
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6575 -DSTOP.KEY=key -jar
start.jar
ec2-user 22523     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7576
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6576 -DSTOP.KEY=key -jar
start.jar
ec2-user 22524     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7577
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6577 -DSTOP.KEY=key -jar
start.jar
ec2-user 22525     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7578
-DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6578 -DSTOP.KEY=key -jar
start.jar
ec2-user 22692 22452  0 18:12 pts/0    00:00:00 grep jetty

Thanks!

Dipti

On 12/20/11 5:32 PM, Mark Miller markrmil...@gmail.com wrote:

You might find the solr/cloud-dev/solrcloud.sh script informative. From a
solrcloud branch checkout, you can run it and it will start up a 2 shard,
6 node cluster with zookeeper running on a single node. stop.sh will
shutdown the 6 nodes. Once you start up the nodes, you can start indexing
and searching on any of them, or use the CloudSolrServer solrj client. It
simply takes the ZooKeeper address and figures out the servers from there
(you do currently still have to pass 

[ann] Lily 1.1 is out

2011-12-21 Thread Steven Noels
Hi everybody,

if you use 'HBase' and 'Solr' in one sentence, Lily might be worth checking
out. It's a scalable data repository layering a high-level (i.e.
easy-to-use) data model + API on top of HBase, with consistent, reliable
maintenance of a configurable Solr index (which can optionally be sharded).
So you get the benefit of a high-scale data store with flexible searching.
Best of all, Lily is open source - Apache license.

We've just released Lily 1.1 and you can read all about it on
www.lilyproject.org. Notable release features are:

   - complex field types (nested records and more)
   - conditional updates
   - a Java test framework, also allowing you to run the entire Lily stack
   (Hadoop/HBase/Zookeeper/Solr/Lily) in a single JVM
   - a new Java Builder API (we do REST as well)
   - various performance improvements with regard to parallelization
   - server-side plugins or decorators
   - for enterprise customers: a Whirr-based cluster installer

You can read more at http://bit.ly/uCIxV7

Thanks,

Steven.
-- 
Steven Noels
http://outerthought.org/
Scalable Smart Data
Makers of Lily


RE: Poor performance on distributed search

2011-12-21 Thread ku3ia
Hi!
Today I added some log statements to Solr, here:
~/solr-3.5/solr/core/src/java/org/apache/solr/servlet/SolrDispatchFilter.java

to method
  private void writeResponse(SolrQueryResponse solrRsp, ServletResponse
response,
 QueryResponseWriter responseWriter,
SolrQueryRequest solrReq, Method reqMethod)
  throws IOException {

at

String charset = ContentStreamBase.getCharsetFromContentType(ct);
Writer out = (charset == null || charset.equalsIgnoreCase("UTF-8"))
? new OutputStreamWriter(response.getOutputStream(), UTF8)
: new OutputStreamWriter(response.getOutputStream(), charset);
// log point 1 (logged below as OutputStreamWriterTime)
out = new FastWriter(out);
// log point 2 (FastWriterTime)
responseWriter.write(out, solrReq, solrRsp);
// log point 3 (responseWriterTime)
out.flush();
// log point 4 (outFlushTime)
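
The timing itself is just wall-clock deltas around each marked point.
Roughly, a self-contained version of the helper I used (the class and stage
names are mine):

import java.util.concurrent.Callable;
import java.util.logging.Logger;

// Minimal helper: run one stage and log how long it took, in milliseconds.
public final class StageTimer {
    private static final Logger LOG =
            Logger.getLogger(StageTimer.class.getName());

    public static <T> T time(String stage, Callable<T> work) throws Exception {
        long start = System.currentTimeMillis();
        try {
            return work.call();
        } finally {
            LOG.warning(stage + "Time=" + (System.currentTimeMillis() - start));
        }
    }
}

// e.g. at log point 3:
//   StageTimer.time("responseWriter", () -> {
//       responseWriter.write(out, solrReq, solrRsp);
//       return null;
//   });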

and here is what I got:
Dec 21, 2011 8:44:26 AM org.apache.solr.servlet.SolrDispatchFilter
writeResponse
WARNING: webapp=/solr path=/select/
params={fl=RecordID,score&start=0&q=mobile&rows=2000} hits=18887 status=0
QTime=2846 OutputStreamWriterTime=1
Dec 21, 2011 8:44:26 AM org.apache.solr.servlet.SolrDispatchFilter
writeResponse
WARNING: webapp=/solr path=/select/
params={fl=RecordID,score&start=0&q=mobile&rows=2000} hits=18887 status=0
QTime=2846 FastWriterTime=0
Dec 21, 2011 8:45:41 AM org.apache.solr.servlet.SolrDispatchFilter
writeResponse
WARNING: webapp=/solr path=/select/
params={fl=RecordID,score&start=0&q=mobile&rows=2000} hits=18887 status=0
QTime=2846 responseWriterTime=74207
Dec 21, 2011 8:45:41 AM org.apache.solr.servlet.SolrDispatchFilter
writeResponse
WARNING: webapp=/solr path=/select/
params={fl=RecordID,score&start=0&q=mobile&rows=2000} hits=18887 status=0
QTime=2846 outFlushTime=0

This is the first query after Tomcat was started. This is on production, on
one shard only:
* ~29M docs
* 11 fields
* ~105M terms
* shard size: 13 GB

Do you have any ideas why this takes so long, or is this a normal write time
for the responseWriter on the first query?

P.S. If I change a query parameter, for example to any other keyword, the
situation is similar.

Thanks.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Poor-performance-on-distributed-search-tp3590028p3605074.html
Sent from the Solr - User mailing list archive at Nabble.com.


Update schema.xml using solrj APIs

2011-12-21 Thread Ahmed Abdeen Hamed
Hello friend,

I am new to SolrJ and I am wondering if there is a way to update the
schema.xml file via the APIs.

I would appreciate any help.

Thanks very much,
-Ahmed


Re: Release build or code for SolrCloud

2011-12-21 Thread Mark Miller
Hey Dipti - that error is normal - the script fires up a tmp zookeeper
server to upload the conf files to. It then shuts that server down,
which unfortunately logs this exception. Then the first Solr instance will
run a zookeeper server. Uploading the configs ahead of time allows us to be
sure the configs are certainly in zookeeper before the other servers start
to come up. If you were doing it by hand, you could just pass the conf dir
to the first Solr you started to upload the confs - then wait a second and
start the other instances. It's done this other way in the script instead to
eliminate any races.
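
If you wanted to script the pre-upload yourself, it is conceptually just
writing each conf file into ZooKeeper under /configs/conf1 (the makePath
lines in your log). A rough sketch with the plain ZooKeeper client; the
paths, port, and conf dir are just the example setup, and real code would
need retries and subdirectory handling:

import java.io.File;
import java.nio.file.Files;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ConfUploader {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:9983", 10000, new Watcher() {
            public void process(WatchedEvent event) { /* no-op */ }
        });
        createIfMissing(zk, "/configs", new byte[0]);
        createIfMissing(zk, "/configs/conf1", new byte[0]);
        for (File f : new File("example/solr/conf").listFiles()) {
            if (f.isFile()) {
                createIfMissing(zk, "/configs/conf1/" + f.getName(),
                        Files.readAllBytes(f.toPath()));
            }
        }
        zk.close();
    }

    private static void createIfMissing(ZooKeeper zk, String path, byte[] data)
            throws Exception {
        if (zk.exists(path, false) == null) {
            zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                    CreateMode.PERSISTENT);
        }
    }
}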

On Wed, Dec 21, 2011 at 2:35 PM, Dipti Srivastava 
dipti.srivast...@apollogrp.edu wrote:

 Ok, so the issue was that I had only copied the cloud_dev, example and
 dist directories, and that's why some of the libraries were missing. I
 copied build and lib as well and got around the issue. Now, I am getting
 this error when I run the script to start the 6-node cluster.

 INFO: makePath: /configs/conf1/velocity/jquery.autocomplete.js
 Dec 21, 2011 7:24:49 PM org.apache.solr.common.cloud.SolrZkClient makePath
 INFO: makePath: /configs/conf1/velocity/query.vm
 Dec 21, 2011 7:24:49 PM org.apache.solr.common.cloud.SolrZkClient makePath
 INFO: makePath: /configs/conf1/velocity/hit.vm
 Dec 21, 2011 7:24:49 PM org.apache.zookeeper.server.ZooKeeperServerMain
 runFromConfig
 WARNING: Server interrupted
 java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1186)
at java.lang.Thread.join(Thread.java:1239)
at
 org.apache.zookeeper.server.NIOServerCnxnFactory.join(NIOServerCnxnFactory.
 java:318)
at
 org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServ
 erMain.java:113)
at org.apache.solr.cloud.SolrZkServer$1.run(SolrZkServer.java:116)
 --CLOUD--[ec2-user@ cloud-dev]$ ps -ef | grep zk
 ec2-user 23796     1 21 19:24 pts/0    00:00:05 java -DzkRun -DnumShards=2
 -DSTOP.PORT=7983 -DSTOP.KEY=key -jar start.jar
 ec2-user 23797     1 18 19:24 pts/0    00:00:04 java -Djetty.port=7574
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6574 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 23798     1 19 19:24 pts/0    00:00:04 java -Djetty.port=7575
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6575 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 23799     1 18 19:24 pts/0    00:00:04 java -Djetty.port=7576
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6576 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 23800     1 19 19:24 pts/0    00:00:04 java -Djetty.port=7577
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6577 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 23801     1 19 19:24 pts/0    00:00:04 java -Djetty.port=7578
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6578 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 23998 22962  0 19:25 pts/0    00:00:00 grep zk
 --CLOUD--[ec2-user@ cloud-dev]$



 Thanks!
 Dipti

 On 12/21/11 10:18 AM, Dipti Srivastava dipti.srivast...@apollogrp.edu
 wrote:

 Hi Mark,
 I built the example and dist and ran the solrcloud.sh script. While
 running I get the following error... Is this ok? It appears that some of
 the instances got started though.
 
 --CLOUD--[ec2-user@ cloud_dev]$ ./solrcloud.sh
 ./solrcloud.sh: line 16: ant: command not found
 Exception in thread "main" java.lang.NoClassDefFoundError:
 org/apache/solr/cloud/ZkController
 Caused by: java.lang.ClassNotFoundException:
 org.apache.solr.cloud.ZkController
 at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.solr.cloud.ZkController.
 Program will exit.
 --CLOUD--[ec2-user@ cloud_dev]$ ls
 solrcloud.sh  stop.sh
 --CLOUD--[ec2-user@ cloud_dev]$ cd ..
 --CLOUD--[ec2-user@ solrcloud]$ ls
 cloud_dev  example  example2  example3  example4  example5  example6
 --CLOUD--[ec2-user@ solrcloud]$ ps -ef | grep solr
 ec2-user 22690 22452  0 18:12 pts/0    00:00:00 grep solr
 --CLOUD--[ec2-user@ solrcloud]$ ps -ef | grep jetty
 ec2-user 22521     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7574
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6574 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 22522     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7575
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6575 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 22523     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7576
 -DzkHost=localhost:9983 -DnumShards=2 -DSTOP.PORT=6576 -DSTOP.KEY=key -jar
 start.jar
 ec2-user 22524     1  4 18:11 pts/0    00:00:02 java -Djetty.port=7577
 

Re: Solr 3.5 | Highlighting

2011-12-21 Thread Koji Sekiguchi

(11/12/21 22:28), Tanguy Moal wrote:

Dear all,

I'm trying to get highlighting working, and I'm almost done, but it's not
perfect yet...

Basically my documents have a title and a description.

I have two kinds of text fields.
text:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

and text_french_light:

<fieldType name="text_french_light" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="0" catenateNumbers="0"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.FrenchLightStemFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

I then define my fields the following way:

<field name="title" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
<field name="title_stemmed" type="text_french_light" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
<field name="title_stemmed_nonorms" type="text_french_light" indexed="true"
       stored="false" omitNorms="true" omitTermFreqAndPositions="true"/>
<field name="description" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
<field name="description_stemmed" type="text_french_light" indexed="true"
       stored="true" termVectors="true" termPositions="true" termOffsets="true"/>
<field name="description_stemmed_nonorms" type="text_french_light" indexed="true"
       stored="false" omitNorms="true" omitTermFreqAndPositions="true"/>

I have the following copyField directives:

<copyField source="title" dest="title_stemmed"/>
<copyField source="title" dest="title_stemmed_nonorms"/>
<copyField source="description" dest="description_stemmed"/>
<copyField source="description" dest="description_stemmed_nonorms"/>

I rely on the dismax query handler to achieve relevancy.

I have two different search use cases:
- a structured search mode where my query looks like q=Term1
term2&qf=my_category_field^1.0&hl.q=Word1 word2&mm=100%
- a free-text search mode where my query looks like q=Term1
term2&qf=title_stemmed_nonorms^1.0
description_stemmed_nonorms^0.5&mm=-40%

Shared query parameters are as follows: defType=dismax&hl=on&hl.fl=title_stemmed
description_stemmed&hl.useFastVectorHighlighter=true&hl.fragListBuilder=single

For all use cases, the relevancy parameters are good: my results are
satisfying.

Troubles concern highlighting:
- in the free-text search mode, everything is fine: the query is not a phrase
query, and highlighted terms may vary from the query terms (if stemming came
into play)
- in the structured search mode, I've got less luck: the query is a phrase
query. Therefore, I rely on the hl.q parameter to achieve my needs. However,
when the query is specified in the hl.q parameter, it isn't processed the way
it should be when highlighting from the fields: query analysis seems not to be
applied.
I can prove 

Re: Release build or code for SolrCloud

2011-12-21 Thread Dipti Srivastava
Hi Mark,
Thanks! So now I am deploying a 4-node cluster on AMIs, and the main
instance that bootstraps the config to ZooKeeper does not come up; I
get an exception as follows. My solrcloud.sh looks like:

#!/usr/bin/env bash

cd ..

rm -r -f example/solr/zoo_data
rm -f example/example.log

cd example
#java -DzkRun -DnumShards=2 -DSTOP.PORT=7983 -DSTOP.KEY=key -jar start.jar
1>example.log 2>&1 &
java -Dbootstrap_confdir=./solr/conf -DzkRun
-DzkHost=ami-1:9983,ami-2:9983,ami-3:9983 -DnumShards=2 -jar
start.jar




And when I RUN it

--CLOUD--[ec2-user@ cloud-dev]$ ./solrcloud.sh
2011-12-22 02:18:23.352:INFO::Logging to STDERR via
org.mortbay.log.StdErrLog
2011-12-22 02:18:23.510:INFO::jetty-6.1-SNAPSHOT
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or
JNDI)
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader init
INFO: Solr home set to 'solr/'
Dec 22, 2011 2:18:23 AM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or
JNDI)
Dec 22, 2011 2:18:23 AM org.apache.solr.core.CoreContainer$Initializer
initialize
INFO: looking for solr.xml: /home/ec2-user/solrcloud/example/solr/solr.xml
Dec 22, 2011 2:18:23 AM org.apache.solr.core.CoreContainer init
INFO: New CoreContainer 1406140084
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: JNDI not configured for solr (NoInitialContextEx)
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or
JNDI)
Dec 22, 2011 2:18:23 AM org.apache.solr.core.SolrResourceLoader init
INFO: Solr home set to 'solr/'
Dec 22, 2011 2:18:24 AM org.apache.solr.cloud.SolrZkServerProps
getProperties
INFO: Reading configuration from: solr/zoo.cfg
Dec 22, 2011 2:18:24 AM org.apache.solr.cloud.SolrZkServerProps
parseProperties
INFO: Defaulting to majority quorums
Dec 22, 2011 2:18:24 AM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start Solr. Check solr/home property and the logs
java.lang.IllegalArgumentException: port out of range:-1
at java.net.InetSocketAddress.init(InetSocketAddress.java:83)
at java.net.InetSocketAddress.init(InetSocketAddress.java:63)
at
org.apache.solr.cloud.SolrZkServerProps.setClientPort(SolrZkServer.java:310
)
at
org.apache.solr.cloud.SolrZkServerProps.getMySeverId(SolrZkServer.java:273)
at
org.apache.solr.cloud.SolrZkServerProps.parseProperties(SolrZkServer.java:4
50)
at org.apache.solr.cloud.SolrZkServer.parseConfig(SolrZkServer.java:85)
at
org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:147)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:329)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:282)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.jav
a:231)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:93)
at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at
org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713
)
at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
at
org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282
)
at
org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
at 
org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:
152)
at
org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCo
llection.java:156)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at
org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:
152)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at
org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
at org.mortbay.jetty.Server.doStart(Server.java:224)
at
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Re: Update schema.xml using solrj APIs

2011-12-21 Thread Otis Gospodnetic
Ahmed,

At this point in time - no.  You need to edit it manually and restart Solr to
see the changes.
This will change in the future.

Otis

Performance Monitoring SaaS for Solr - 
http://sematext.com/spm/solr-performance-monitoring/index.html




 From: Ahmed Abdeen Hamed ahmed.elma...@gmail.com
To: solr-user@lucene.apache.org 
Sent: Wednesday, December 21, 2011 4:12 PM
Subject: Update schema.xml using solrj APIs
 
Hello friend,

I am new to SolrJ and I am wondering if there is a way to update the
schema.xml file via the APIs.

I would appreciate any help.

Thanks very much,
-Ahmed




Re: Solr - Mutivalue field search on different elements

2011-12-21 Thread meghana
Hi Tanguy,

Thanks for your reply, this is really useful. But I have one question on
that.

My multivalued field is not just simple text. It has values like below:
<str>1s:[This is very nice day.]</str>
<str>3s:[Christmas is about come and christmas]</str>
<str>4s:[preparation is just on ]</str>
Now if I search with "christmas preparation", this should match. If I set
positionIncrementGap to 0, will it match? And how does the value of
positionIncrementGap affect my search?

Meghana

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Mutivalue-field-search-on-different-elements-tp3604213p3605938.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Exception using SolrJ

2011-12-21 Thread Shawn Heisey

On 12/20/2011 10:33 AM, Otis Gospodnetic wrote:

Shawn,

Give httping a try: http://www.vanheusden.com/httping/

It may reveal something about connection being dropped periodically.
Maybe even a plain ping would show some dropped packets if it's a general 
network and not a Solr-specific issue.


The connections here are gigabit ethernet on the same VLAN, and
sometimes it happens to cores on the same box that's running the SolrJ
code, which, if all things are sane, never actually goes out the NIC.  I
see no errors on the interface.


bond0 Link encap:Ethernet  HWaddr 00:1C:23:DC:81:53
  inet addr:10.100.0.240  Bcast:10.100.1.255  Mask:255.255.254.0
  inet6 addr: fe80::21c:23ff:fedc:8153/64 Scope:Link
  UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
  RX packets:453134140 errors:0 dropped:0 overruns:0 frame:0
  TX packets:297893403 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:0
  RX bytes:446857564768 (416.1 GiB)  TX bytes:191134876472 
(178.0 GiB)


BONDING_OPTS=mode=1 miimon=100 updelay=200 downdelay=200 primary=eth0

Thanks,
Shawn



a question on jmx solr exposure

2011-12-21 Thread Dmitry Kan
Hello list,

This might not be the right place to ask JMX-specific questions, but I
decided to try, as we are polling Solr statistics through JMX.

We currently have two Solr cores, with different schemas A and B, being run
under the same Tomcat instance. The question is: which stats is jconsole going
to see under solr/ ?

From the numbers (e.g. numDocs of the searcher), jconsole sees the stats of A.
Where do the stats of B go? Or will the first activated core capture the JMX
pipe and not let B's stats through?
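
For reference, our polling looks essentially like this (a sketch; the
ObjectName below is what jconsole shows for a core named A, so treat the exact
pattern as an assumption and check your own MBean tree):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SolrJmxPoller {
    public static void main(String[] args) throws Exception {
        // Assumes Tomcat was started with -Dcom.sun.management.jmxremote.port=9999
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            // Multicore Solr registers each core under its own domain, e.g. "solr/A"
            ObjectName searcher = new ObjectName(
                    "solr/A:type=searcher,id=org.apache.solr.search.SolrIndexSearcher");
            System.out.println("numDocs = " + conn.getAttribute(searcher, "numDocs"));
        } finally {
            connector.close();
        }
    }
}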

-- 
Regards,

Dmitry Kan


Re: disable stemming on query parser.

2011-12-21 Thread meghana
Hi Dmitry,

If we add some unseen character sequence to the array, doesn't it remove my
stemming at all times? How can we manage stemmed and unstemmed words in the
same field? I am a bit confused on this.

Also, I tried enabling compression on a field which I use as a copy field.
From what I read about compression on a field, it should make your index size
lower, and it lowers performance a bit while querying. But when I tried it on
my local Solr configuration (which has about 5000 records, and the copy field
size is more than 5000 chars, or maybe much more), it behaved totally
opposite: it increased my index file size, and performance did not decrease
either. Have you any idea why it behaved like this?

I'd like to note that I tried this with my local configuration of Solr. In
the live Solr we have more than 10 lakh (1 million) records, and the copy
field size is very big (about 5000 chars or much more).

Thanks in advance,
Meghana

--
View this message in context: 
http://lucene.472066.n3.nabble.com/disable-stemming-on-query-parser-tp3591420p3603675.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Tomcat Maximum Heap Memory

2011-12-21 Thread Husain, Yavar
I know this is a Solr forum; however, my problem is related to Solr running on
Tomcat, running on a 64-bit Windows OS.

I am running a 32-bit JVM on a 64-bit Windows 2008 Server. The max heap space I
am able to allocate is around 1.5 GB, though I have 10 GB of RAM on my system
and there is no other process running.
I understand that roughly 2 GB is the most heap a 32-bit process can be
allocated on Windows. However, I have seen people in the forums state they use
-Xmx up to 10 GB. How is this possible? If I move to Linux, can I get more heap
space allocated to the process, or is it a JVM limitation?

Simply put, how can I allocate at least 8 GB of RAM as -Xmx to Tomcat on my
64-bit Windows box? Tomcat crashes when I start it with a larger heap. Please
help.
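
A trivial way to verify the heap a given JVM will actually grant (minimal,
self-contained check):

public class MaxHeapCheck {
    public static void main(String[] args) {
        long max = Runtime.getRuntime().maxMemory();
        System.out.printf("Max heap this JVM will use: %.2f GB%n",
                max / (1024.0 * 1024.0 * 1024.0));
    }
}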


Re: a question on jmx solr exposure

2011-12-21 Thread Dmitry Kan
Solved by exposing JMX on only one of the cores, as it is of more
interest than the other one.

Dmitry

On Wed, Dec 21, 2011 at 11:56 AM, Dmitry Kan dmitry@gmail.com wrote:

 Hello list,

 This might not be the right place to ask JMX-specific questions, but I
 decided to try, as we are polling Solr statistics through JMX.

 We currently have two Solr cores, with different schemas A and B, being run
 under the same Tomcat instance. The question is: which stats is jconsole going
 to see under solr/ ?

 From the numbers (e.g. numDocs of the searcher), jconsole sees the stats of A.
 Where do the stats of B go? Or will the first activated core capture the JMX
 pipe and not let B's stats through?

 --
 Regards,

 Dmitry Kan



solr.home

2011-12-21 Thread Thomas Fischer
Hello,

I'm trying to move my solr system forward from 1.4 to 3.5 and have run into
some problems with solr home.
Is this a known problem?

My solr 1.4 gives me the following messages (amongst many many others…) in 
catalina.out:

INFO: No /solr/home in JNDI
INFO: using system property solr.solr.home: '/srv/solr'
INFO: looking for solr.xml: /'/srv/solr'/solr.xml

then finds the solr.xml and proceeds from there (this is multicore).

With solr 3.5 I get:

INFO: No /solr/home in JNDI
INFO: using system property solr.solr.home: '/srv/solr'
INFO: Solr home set to ''/srv/solr'/'
INFO: Solr home set to ''/srv/solr'/./'
SCHWERWIEGEND: java.lang.RuntimeException: Can't find resource '' in classpath 
or ''/srv/solr'/./conf/', cwd=/

After that, solr somehow starts but is not aware of the cores present.

This can be solved by putting a solr.xml file into 
$CATALINA_HOME/conf/Catalina/localhost/ with
<Environment name="solr/home" type="java.lang.String" value="/srv/solr"
             override="true" />
which results in
INFO: Using JNDI solr.home: /srv/solr
and everything seems to run smoothly afterwards, although solr.xml is never
mentioned.
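
If I read the code correctly, the lookup solr performs is essentially the
following (a sketch from memory, not the exact source):

import javax.naming.Context;
import javax.naming.InitialContext;

public class SolrHomeLookup {
    public static void main(String[] args) {
        // Roughly what SolrResourceLoader.locateSolrHome does: try JNDI first,
        // then the solr.solr.home system property, then fall back to "solr/".
        String home;
        try {
            Context c = new InitialContext();
            home = (String) c.lookup("java:comp/env/solr/home");
        } catch (Exception e) {
            home = System.getProperty("solr.solr.home", "solr/");
        }
        System.out.println("solr home = " + home);
    }
}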

I would like to know when this changed and why, and why solr 3.5 is looking for 
solrconfig.xml instead of solr.xml in solr.home

(Am I the only one who finds it confusing to have the three names 
solr.solr.home (system property),  solr.home (JNDI), solr/home (Environment 
name) for the same object?)

Best
Thomas