Retrieving Phonetic Code as result

2015-01-22 Thread Amit Jha
Hi,

I need to know how I can retrieve phonetic codes. Does Solr provide them as
part of the result? I need the codes for record matching.

*following is schema fragment:*

<fieldType name="phonetic" stored="true" indexed="true" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
  </analyzer>
</fieldType>

<field name="firstname" type="text_general" indexed="true" stored="true"/>
<field name="firstname_phonetic" type="phonetic"/>
<field name="lastname_phonetic" type="phonetic"/>
<field name="lastname" type="text_general" indexed="true" stored="true"/>

<copyField source="lastname" dest="lastname_phonetic"/>
<copyField source="firstname" dest="firstname_phonetic"/>


Re: Retrieving Phonetic Code as result

2015-01-22 Thread Shawn Heisey
On 1/22/2015 6:42 AM, Amit Jha wrote:
 I need to know how can I retrieve phonetic codes. Does solr provide it as
 part of result? I need codes for record matching.
 
 *following is schema fragment:*
 
 <fieldType name="phonetic" stored="true" indexed="true" class="solr.TextField">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
   </analyzer>
 </fieldType>

 <field name="firstname" type="text_general" indexed="true" stored="true"/>
 <field name="firstname_phonetic" type="phonetic"/>
 <field name="lastname_phonetic" type="phonetic"/>
 <field name="lastname" type="text_general" indexed="true" stored="true"/>

 <copyField source="lastname" dest="lastname_phonetic"/>
 <copyField source="firstname" dest="firstname_phonetic"/>

The indexed data (which would include the phonetic transformation) is
never returned in results.  The returned results are ALWAYS the original
values, unaffected by analysis.

If the field does not have docValues enabled (yours does not), then the
indexed values are available in facets ... but there is no way to see
the direct relationship between facet values and individual documents.

For the most flexibility with seeing index contents, you could make a
copy of your index directory and load it into a separate program -- Luke.

https://github.com/DmitryKey/luke/releases/tag/luke-4.10.1

You can also enable the debugQuery parameter on a query to see how the
score is calculated, which does include some information about indexed
values.  It takes time and a fair amount of experience to read the debug
data successfully, and a query with debug is noticeably slower than without.

One last bit of information:  If you know what the stored value is, you
use that value on the Analysis page in the Solr admin UI and see what
the final indexed terms (tokens) are.

Thanks,
Shawn
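[Editor's note: if the codes are needed outside Solr for record matching, they can also be computed client-side. The sketch below uses American Soundex, a much simpler relative of the Double Metaphone filter in the schema above, implemented with only the Python standard library; it is an illustration of what a phonetic code is, not the algorithm Solr runs.]

```python
def soundex(name: str) -> str:
    """Return the 4-character American Soundex code for a name.

    Soundex is a simpler phonetic algorithm than Double Metaphone,
    shown here only to illustrate phonetic coding for record matching.
    """
    codes = {**dict.fromkeys("bfpv", "1"),
             **dict.fromkeys("cgjkqsxz", "2"),
             **dict.fromkeys("dt", "3"),
             "l": "4",
             **dict.fromkeys("mn", "5"),
             "r": "6"}
    name = "".join(ch for ch in name.lower() if ch.isalpha())
    if not name:
        return ""
    first, prev = name[0].upper(), codes.get(name[0], "")
    digits = []
    for ch in name[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            digits.append(code)
        if ch not in "hw":      # h and w do not reset the previous code
            prev = code
    return (first + "".join(digits) + "000")[:4]

# Names that sound alike map to the same code, which is exactly what
# record matching on phonetic codes relies on:
print(soundex("Robert"), soundex("Rupert"))   # R163 R163
```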




Re: Retrieving Phonetic Code as result

2015-01-22 Thread Amit Jha
Thanks for the response. I can see the generated Metaphone codes using Luke. I am
using Solr only because it creates the phonetic codes at indexing time;
otherwise, for each record I would need to call the Metaphone algorithm in real
time to get the codes and compare them. If Luke can read and display them,
why can't Solr?

On Thu, Jan 22, 2015 at 7:54 PM, Amit Jha shanuu@gmail.com wrote:

 Hi,

 I need to know how can I retrieve phonetic codes. Does solr provide it as
 part of result? I need codes for record matching.

 *following is schema fragment:*

 <fieldType name="phonetic" stored="true" indexed="true" class="solr.TextField">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
   </analyzer>
 </fieldType>

 <field name="firstname" type="text_general" indexed="true" stored="true"/>
 <field name="firstname_phonetic" type="phonetic"/>
 <field name="lastname_phonetic" type="phonetic"/>
 <field name="lastname" type="text_general" indexed="true" stored="true"/>

 <copyField source="lastname" dest="lastname_phonetic"/>
 <copyField source="firstname" dest="firstname_phonetic"/>

 Hi,

 Thanks for response, I can see generated MetaPhone codes using Luke. I am
 using solr only because it creates the phonetic code at time of indexing.
 Otherwise for each record I need to call Metaphone algorithm in realtime to
 get the codes and compare them. I think when luke can read and display it,
 why can't solr?




RE: Field collapsing memory usage

2015-01-22 Thread Toke Eskildsen
Norgorn [lsunnyd...@mail.ru] wrote:
 Nice, thanks!
 If you'd like, I'll write up our results with that amazing util.

By all means, please do. Good as well as bad. Independent testing is needed to
ensure properly working tools.

- Toke Eskildsen


Re: If I change schema.xml then reIndex is neccessary in Solr or not?

2015-01-22 Thread Vishal Swaroop
We noticed that Solr/Tomcat also needs a restart... is it the same for you?

Regards


On Thu, Jan 22, 2015 at 2:11 AM, Nitin Solanki nitinml...@gmail.com wrote:

 Ok. Thanx

 On Thu, Jan 22, 2015 at 11:38 AM, Gora Mohanty g...@mimirtech.com wrote:

  On 22 January 2015 at 11:23, Nitin Solanki nitinml...@gmail.com wrote:
   I *indexed* *2GB* of data. Now I want to *change* the *type* of *field*
   from *textSpell* to *string* type into
 
  Yes, one would need to reindex.
 
  Regards,
  Gora
 



Re: If I change schema.xml then reIndex is neccessary in Solr or not?

2015-01-22 Thread Shawn Heisey
On 1/22/2015 6:25 AM, Vishal Swaroop wrote:
 We noticed that Solr/Tomcat also needs a restart... is it the same for you?

For a change in solrconfig or schema to become effective, the core or
collection must be reloaded, or a container restart is required.  Once
the change is active because of the reload or restart, a reindex may be
required, depending on the nature of the change.

Thanks,
Shawn
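[Editor's note: the reload Shawn mentions can be triggered over HTTP without any container restart. A sketch, assuming Solr on localhost:8983 and a core/collection named collection1 — both placeholders to adjust for your setup:]

```shell
# Reload a single core via the Core Admin API:
curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1"

# For SolrCloud, the Collections API equivalent reloads every replica:
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=collection1"
```

Whether a reindex is then needed still depends on the nature of the schema change, as Shawn notes.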



Re: Retrieving Phonetic Code as result

2015-01-22 Thread Alexandre Rafalovitch
What are you actually trying to do on a business level? Because this
feels like an XY Problem:
https://people.apache.org/~hossman/#xyproblem

Solr will generate the Metaphone codes during indexing and again during query,
and will do the matching. It's not clear why you actually want to get
those codes back outside of Solr.

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 22 January 2015 at 09:24, Amit Jha shanuu@gmail.com wrote:
 Hi,

 I need to know how can I retrieve phonetic codes. Does solr provide it as
 part of result? I need codes for record matching.

 *following is schema fragment:*

 <fieldType name="phonetic" stored="true" indexed="true" class="solr.TextField">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
   </analyzer>
 </fieldType>

 <field name="firstname" type="text_general" indexed="true" stored="true"/>
 <field name="firstname_phonetic" type="phonetic"/>
 <field name="lastname_phonetic" type="phonetic"/>
 <field name="lastname" type="text_general" indexed="true" stored="true"/>

 <copyField source="lastname" dest="lastname_phonetic"/>
 <copyField source="firstname" dest="firstname_phonetic"/>

 Hi,

 Thanks for response, I can see generated MetaPhone codes using Luke. I am
 using solr only because it creates the phonetic code at time of indexing.
 Otherwise for each record I need to call Metaphone algorithm in realtime to
 get the codes and compare them. I think when luke can read and display it,
 why can't solr?


RE: Field collapsing memory usage

2015-01-22 Thread Norgorn
Nice, thanks!
If you'd like, I'll write up our results with that amazing util.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-memory-usage-tp4181092p4181159.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Field collapsing memory usage

2015-01-22 Thread Norgorn
Thank you for your answer.
We've found out that the problem was in our Solr build (Heliosearch 0.08).
There are no crashes after changing to 4.10.3 (although there are lots of
OOMs while handling queries, which is not really strange for 1.1 billion
documents). Now we are going to try the latest Heliosearch.

Is there any way to make 'docValues=true' without reindexing?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Field-collapsing-memory-usage-tp4181092p4181108.html
Sent from the Solr - User mailing list archive at Nabble.com.


WordDelimiterFilterFactory and position increment.

2015-01-22 Thread Modassar Ather
Hi,

I am using WordDelimiterFilter while indexing. Parser used is edismax.
Phrase search is failing for terms like 3d image.

On the analysis page it shows the following four tokens for *3d* and their
positions.

*token  position*
3d      1
3       1
3d      1
d       2

image   3

Here the token d is at position 2, which per my understanding causes the
phrase search "3d image" to fail. "3d image"~1 works fine. The same behavior
is present for "wi-fi device" and a few other queries starting with a token
which is split as shown above in the table.

Kindly help me understand the behavior and let me know how the phrase
search is possible in such cases without the slop.

Thanks,
Modassar
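[Editor's note: the failure can be reproduced outside Solr with a toy adjacency check. The sketch below is plain Python, not Lucene code, and simplifies Lucene's sloppy-phrase semantics to a single shared slop budget: a phrase query wants each successive term at the previous position + 1, and because d occupies position 2 and pushes image to position 3, "3d image" only matches once a slop of 1 is allowed.]

```python
def phrase_matches(index, phrase, slop=0):
    """index: list of (token, position) pairs; phrase: list of tokens.
    Returns True if the phrase terms occur at successive positions,
    allowing gaps to consume up to `slop` extra positions in total
    (a simplification of Lucene's sloppy phrase matching)."""
    positions = {}
    for token, pos in index:
        positions.setdefault(token, []).append(pos)

    def search(i, prev_pos, budget):
        if i == len(phrase):
            return True
        for pos in positions.get(phrase[i], []):
            gap = pos - prev_pos - 1          # extra distance beyond adjacency
            if 0 <= gap <= budget and search(i + 1, pos, budget - gap):
                return True
        return False

    # Try every starting position of the first phrase term.
    return any(search(1, p, slop) for p in positions.get(phrase[0], []))

# Tokens and positions from the analysis-page table above:
index = [("3d", 1), ("3", 1), ("3d", 1), ("d", 2), ("image", 3)]
print(phrase_matches(index, ["3d", "image"]))          # False: image is 2 away
print(phrase_matches(index, ["3d", "image"], slop=1))  # True: "3d image"~1
```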


SolrCloud timing out marking node as down during startup.

2015-01-22 Thread Michael Roberts
Hi,

I'm seeing some odd behavior that I am hoping someone could explain to me.

The configuration I'm using to repro the issue, has a ZK cluster and a single 
Solr instance. The instance has 10 Cores, and none of the cores are sharded.

The initial startup is fine, the Solr instance comes up and we build our index. 
However if the Solr instance exits uncleanly (killed rather than sent a 
SIGINT), the next time it starts I see the following in the logs.

2015-01-22 09:56:23.236 -0800 (,,,) localhost-startStop-1 : INFO  
org.apache.solr.common.cloud.ZkStateReader - Updating cluster state from 
ZooKeeper...
2015-01-22 09:56:30.008 -0800 (,,,) localhost-startStop-1-EventThread : DEBUG 
org.apache.solr.common.cloud.SolrZkClient - Submitting job to respond to event 
WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes
2015-01-22 09:56:30.008 -0800 (,,,) zkCallback-2-thread-1 : DEBUG 
org.apache.solr.common.cloud.ZkStateReader - Updating live nodes... (0)
2015-01-22 09:57:24.102 -0800 (,,,) localhost-startStop-1 : WARN  
org.apache.solr.cloud.ZkController - Timed out waiting to see all nodes 
published as DOWN in our cluster state.
2015-01-22 09:57:24.102 -0800 (,,,) localhost-startStop-1 : INFO  
org.apache.solr.cloud.ZkController - Register node as live in 
ZooKeeper:/live_nodes/10.18.8.113:11000_solr
My question is about "Timed out waiting to see all nodes published as DOWN in
our cluster state."

Cursory look at the code, we seem to iterate through all Collections/Shards, 
and mark the state as Down. These notifications are offered to the Overseer, 
who I believe updates the ZK state. We then wait for the ZK state to update, 
with the 60 second timeout.

However, it looks like the Overseer is not started until after we wait for the 
timeout. So, in a single instance scenario we'll always have to wait for the 
timeout.

Is this the expected behavior (and just a side effect of running a single
instance in cloud mode), or is my understanding of the Overseer/ZK relationship
incorrect?

Thanks.

.Mike



RE: Field collapsing memory usage

2015-01-22 Thread Toke Eskildsen
Norgorn [lsunnyd...@mail.ru] wrote:
 Is there any way to make 'docValues=true' without reindexing?

Depends on how brave you are :-)

We recently had the same need and made
https://github.com/netarchivesuite/dvenabler
To my knowledge that is the only existing tool for the task, and as we are the
only ones having used it, robustness is not guaranteed. Warnings aside, it
works without problems in our tests as well as on the few real corpuses we have
tested. It does use a fairly memory-hungry structure during the conversion:
if the number of _unique_ values in your grouping field approaches 1 billion, I
loosely guess that you will need 40GB+ of heap. Do read
https://github.com/netarchivesuite/dvenabler/issues/14 if you want to try it.

- Toke Eskildsen


Suggester Example In Documentation Not Working

2015-01-22 Thread Charles Sanders
Attempting to follow the documentation found here: 
https://cwiki.apache.org/confluence/display/solr/Suggester 

The example given in the documentation is not working. See below my 
configuration. I only changed the field names to those in my schema. Can anyone 
provide an example for this component that actually works? 

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">sugg_allText</str>
    <str name="weightField">suggestWeight</str>
    <str name="suggestAnalyzerFieldType">string</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.build">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

<field name="sugg_allText" type="string" indexed="true" multiValued="true"
       stored="false"/>
<field name="suggestWeight" type="long" indexed="true" stored="true"
       default="1"/>


http://localhost:/solr/collection1/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&wt=json&suggest.q=kern

{"responseHeader":{"status":0,"QTime":4},"command":"build","suggest":{"mySuggester":{"kern":{"numFound":0,"suggestions":[]}}}}
 


Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-22 Thread Carl Roberts

Thanks.  I am looking at the RSS DIH example right now.


On 1/21/15, 3:15 PM, Alexandre Rafalovitch wrote:

Solr is just fine for this.

It even ships with an example of how to read an RSS file under the DIH
directory. DIH is also most likely what you will use for the first
implementation. Don't need to worry about Stax or anything, unless
your file format is very weird or has overlapping namespaces (DIH XML
parser does not care about namespaces).

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 21 January 2015 at 14:53, Carl Roberts carl.roberts.zap...@gmail.com wrote:

Hi,

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several elements in
each node that I have to index, so I was planning to parse the XML with Stax
and extract the data from each node and add it to Solr.  There will always
be only one file to start with and then a second file as the RSS feed
supplies updates.  I want to return certain fields of each node when I
search certain fields of the same node.  Is Solr overkill in this case?
Should I just use Lucene instead?

Regards,

Joe




Re: Is Solr a good candidate to index 100s of nodes in one XML file?

2015-01-22 Thread Carl Roberts
Thanks for the input.  I think one benefit of using Solr is also that I 
can provide a REST API to search the indexed records.


Regards,

Joe
On 1/21/15, 3:17 PM, Shawn Heisey wrote:

On 1/21/2015 12:53 PM, Carl Roberts wrote:

Is Solr a good candidate to index 100s of nodes in one XML file?

I have an RSS feed XML file that has 100s of nodes with several
elements in each node that I have to index, so I was planning to parse
the XML with Stax and extract the data from each node and add it to
Solr.  There will always be only one one file to start with and then a
second file as the RSS feeds supplies updates.  I want to return
certain fields of each node when I search certain fields of the same
node.  Is Solr overkill in this case?  Should I just use Lucene instead?

Effectively, Solr *is* Lucene.  You edit configuration files instead of
writing Lucene code, because Solr is a fully customizable search server,
not a programming API.  That also means that it's not as flexible as
Lucene ... but it's a lot easier.

If you're capable of writing Lucene code, chances are that you'll be
able to write an application that is highly tailored to your situation
that will have better performance than Solr ... but you'll be writing
the entire program yourself.  Solr lets you install an existing program
and just change the configuration.

Thanks,
Shawn





Is there a way to pass in proxy settings to Solr?

2015-01-22 Thread Carl Roberts

Hi,

Is there a way to pass in proxy settings to Solr?

The reason that I am asking this question is that I am trying to run the 
DIH RSS example, and it is not working when I try to import the RSS feed 
URL because the code in Solr comes back with an unknown host exception 
due to the proxy that we use at work.


If I use the curl tool and the environment variable http_proxy to access 
the RSS feed directly it works, but it appears Solr does not use that 
environment variable because it is throwing this error:


39642 [Thread-15] ERROR org.apache.solr.handler.dataimport.URLDataSource 
– Exception thrown while getting data

java.net.UnknownHostException: rss.slashdot.org
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)

at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
at sun.net.www.http.HttpClient.New(HttpClient.java:308)
at sun.net.www.http.HttpClient.New(HttpClient.java:326)
at 
sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:996)
at 
sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932)
at 
sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:850)
at 
sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1300)
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:98)
at 
org.apache.solr.handler.dataimport.URLDataSource.getData(URLDataSource.java:42)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:283)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:224)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:204)
at 
org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:243)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:476)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)


Thanks in advance,

Joe
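[Editor's note: the JVM, and therefore Solr, ignores the http_proxy environment variable that curl honors; it reads its own standard networking system properties instead. A hedged sketch, assuming the stock start.jar launcher from the Solr 4.x example directory (for Tomcat, add the same flags to JAVA_OPTS); proxy.example.com:8080 is a placeholder for the real proxy:]

```shell
# Pass the standard JVM proxy properties when starting Solr so that
# outbound HTTP calls (such as DIH fetching an RSS feed) go via the proxy.
java -Dhttp.proxyHost=proxy.example.com \
     -Dhttp.proxyPort=8080 \
     -Dhttp.nonProxyHosts="localhost|127.0.0.1" \
     -jar start.jar
```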



Avoiding wildcard queries using edismax query parser

2015-01-22 Thread Jorge Luis Betancourt González
Hello all,

Currently we are using edismax query parser in an internal application, we've 
detected that some wildcard queries including * are causing some performance 
issues and for this particular case we're not interested in allowing any user 
to request all the indexed documents. 

This could easily be escaped at the application level, but right now we have
several applications (using several programming languages) consuming from Solr,
and adding this to each application is kind of exhausting, so I'm wondering
if there is some configuration that allows us to treat these special characters
as normal alphanumeric characters.

I've tried one solution that worked before, involving the WordDelimiterFilter
and the types attribute:

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
        generateNumberParts="0" catenateWords="0" catenateNumbers="0"
        catenateAll="0" splitOnCaseChange="0" preserveOriginal="0"
        types="characters.txt"/>

and in characters.txt I've mapped the special characters into ALPHA:

+ = ALPHA 
* = ALPHA 

Any thoughts on this?


---
XII Aniversario de la creación de la Universidad de las Ciencias Informáticas. 
12 años de historia junto a Fidel. 12 de diciembre de 2014.
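[Editor's note: escaping at the application level, which the post dismisses as repetitive, is only a few lines per client. A sketch in plain Python; the character set below is a recollection of the Lucene/edismax special characters, so treat it as an assumption to check against your Solr version.]

```python
import re

# Characters with special meaning in the Lucene/edismax query syntax
# (assumed list -- verify against the query parser documentation).
SPECIALS = '\\+-!(){}[]^"~*?:/&|'

def escape_query(text: str) -> str:
    """Backslash-escape query-syntax characters so user input is
    treated as literal text rather than operators or wildcards."""
    return re.sub('([' + re.escape(SPECIALS) + '])', r'\\\1', text)

print(escape_query('anything*'))  # anything\*
```

Running every user-supplied query string through such a function before it reaches Solr neutralizes `*` without any schema changes.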



Re: Avoiding wildcard queries using edismax query parser

2015-01-22 Thread Alexandre Rafalovitch
I suspect the special characters get caught before the analyzer chains run.

But what about prepending a custom search component?

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 22 January 2015 at 16:33, Jorge Luis Betancourt González
jlbetanco...@uci.cu wrote:
 Hello all,

 Currently we are using edismax query parser in an internal application, we've 
 detected that some wildcard queries including * are causing some 
 performance issues and for this particular case we're not interested in 
 allowing any user to request all the indexed documents.

 This could be easily escaped in the application level, but right now we have 
 several applications (using several programming languages) consuming from 
 Solr, and adding this into each application is kind of exhausting, so I'm 
 wondering if there is some configuration that allow us to treat this special 
 characters as normal alphanumeric characters.

 I've tried one solution that worked before, involving the WordDelimiterFilter
 and the types attribute:

 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
         generateNumberParts="0" catenateWords="0" catenateNumbers="0"
         catenateAll="0" splitOnCaseChange="0" preserveOriginal="0"
         types="characters.txt"/>

 and in characters.txt I've mapped the special characters into ALPHA:

 + = ALPHA
 * = ALPHA

 Any thoughts on this?


 ---
 XII Aniversario de la creación de la Universidad de las Ciencias 
 Informáticas. 12 años de historia junto a Fidel. 12 de diciembre de 2014.



Re: Field collapsing memory usage

2015-01-22 Thread Erick Erickson
Toke:

What do you think about folding this into the Solr (or Lucene?) code
base? Or is it to specialized?

Not sure one way or the other, just askin'

Erick

On Thu, Jan 22, 2015 at 3:47 AM, Toke Eskildsen t...@statsbiblioteket.dk 
wrote:
 Norgorn [lsunnyd...@mail.ru] wrote:
 Is there any way to make 'docValues=true' without reindexing?

 Depends on how brave you are :-)

 We recently had the same need and made
 https://github.com/netarchivesuite/dvenabler
 To my knowledge that is the only existing tool for the task, and as we are the
 only ones having used it, robustness is not guaranteed. Warnings aside, it
 works without problems in our tests as well as on the few real corpuses we
 have tested. It does use a fairly memory-hungry structure during the
 conversion: if the number of _unique_ values in your grouping field
 approaches 1 billion, I loosely guess that you will need 40GB+ of heap. Do read
 https://github.com/netarchivesuite/dvenabler/issues/14 if you want to try it.

 - Toke Eskildsen


Re: Retrieving Phonetic Code as result

2015-01-22 Thread Erik Hatcher
Faceting returns indexed terms.  So adding
facet=on&facet.field=firstname_phonetic will get you back the phonetic codes
across an entire result set.

If you have a single string and want the phonetic codes back, you can use the 
analysis request handler (document or field).

For a bit more detail, check out the files I added to this JIRA: 
https://issues.apache.org/jira/browse/SOLR-3551 
https://issues.apache.org/jira/browse/SOLR-3551

Erik


 On Jan 22, 2015, at 6:25 AM, Amit Jha shanuu@gmail.com wrote:
 
 Thanks for response, I can see generated MetaPhone codes using Luke. I am
 using solr only because it creates the phonetic code at time of indexing.
 Otherwise for each record I need to call Metaphone algorithm in realtime to
 get the codes and compare them. I think when luke can read and display it,
 why can't solr
 
 On Thu, Jan 22, 2015 at 7:54 PM, Amit Jha shanuu@gmail.com wrote:
 
 Hi,
 
 I need to know how can I retrieve phonetic codes. Does solr provide it as
 part of result? I need codes for record matching.
 
 *following is schema fragment:*
 
 <fieldType name="phonetic" stored="true" indexed="true" class="solr.TextField">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.DoubleMetaphoneFilterFactory" inject="true" maxCodeLength="4"/>
   </analyzer>
 </fieldType>

 <field name="firstname" type="text_general" indexed="true" stored="true"/>
 <field name="firstname_phonetic" type="phonetic"/>
 <field name="lastname_phonetic" type="phonetic"/>
 <field name="lastname" type="text_general" indexed="true" stored="true"/>

 <copyField source="lastname" dest="lastname_phonetic"/>
 <copyField source="firstname" dest="firstname_phonetic"/>
 
 Hi,
 
 Thanks for response, I can see generated MetaPhone codes using Luke. I am
 using solr only because it creates the phonetic code at time of indexing.
 Otherwise for each record I need to call Metaphone algorithm in realtime to
 get the codes and compare them. I think when luke can read and display it,
 why can't solr?
 
 



Re: Avoiding wildcard queries using edismax query parser

2015-01-22 Thread Jack Krupansky
The problem is that the presence of a wildcard causes Solr to skip the
usual token analysis. But... you could add a multiterm analyzer, and then
the wildcard would just get treated as punctuation.

-- Jack Krupansky

On Thu, Jan 22, 2015 at 4:33 PM, Jorge Luis Betancourt González 
jlbetanco...@uci.cu wrote:

 Hello all,

 Currently we are using edismax query parser in an internal application,
 we've detected that some wildcard queries including * are causing some
 performance issues and for this particular case we're not interested in
 allowing any user to request all the indexed documents.

 This could be easily escaped in the application level, but right now we
 have several applications (using several programming languages) consuming
 from Solr, and adding this into each application is kind of exhausting, so
 I'm wondering if there is some configuration that allow us to treat this
 special characters as normal alphanumeric characters.

 I've tried one solution that worked before, involving the
 WordDelimiterFilter and the types attribute:

 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
         generateNumberParts="0" catenateWords="0" catenateNumbers="0"
         catenateAll="0" splitOnCaseChange="0" preserveOriginal="0"
         types="characters.txt"/>

 and in characters.txt I've mapped the special characters into ALPHA:

 + = ALPHA
 * = ALPHA

 Any thoughts on this?


 ---
 XII Aniversario de la creación de la Universidad de las Ciencias
 Informáticas. 12 años de historia junto a Fidel. 12 de diciembre de 2014.




Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Carl Roberts

Hi Walter,

If I try this from my Mac shell:

curl http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:Oracle Fusion


I don't get a response.

If I try this, it works!:

curl "http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=name:Oracle"


So I think the entire curl URL needs to be in quotes on the command line,
and my problem is that I do not know how to put the URL in quotes and
then the field value in quotes inside that.


BTW - If I try the first URL from a browser, it works just fine.

Any suggestions?
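[Editor's note: the shell splits the unquoted URL at the space and treats & as a background operator, so the space and inner quotes must be percent-encoded and the whole URL single-quoted. A sketch of building the query string safely with the Python standard library, using the host and core name from the message:]

```python
from urllib.parse import urlencode

params = {
    "q": 'summary:"Oracle Fusion Middleware"',  # quoted phrase query
    "wt": "json",
    "indent": "true",
}
qs = urlencode(params)  # percent-encodes the quotes, encodes spaces as '+'
url = "http://localhost:8983/solr/nvd-rss/select?" + qs
print(url)
# curl can then take the whole thing in single quotes:
#   curl 'http://localhost:8983/solr/nvd-rss/select?q=summary%3A%22Oracle+Fusion+Middleware%22&wt=json&indent=true'
```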

On 1/22/15, 5:54 PM, Walter Underwood wrote:

Your query is this:

summary:Oracle Fusion Middleware

That searches for “Oracle” in the summary field and “Fusion” and “Middleware” 
in whatever your default field is.

You want:

summary:"Oracle Fusion Middleware"

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/


On Jan 22, 2015, at 2:47 PM, Carl Roberts carl.roberts.zap...@gmail.com wrote:


Hi,

How do you query a sentence composed of multiple words in a description field?

I want to search for the sentence "Oracle Fusion Middleware" but when I try the
following search query in curl, I get nothing:

curl http://localhost:8983/solr/nvd-rss/select?q=summary:Oracle Fusion Middleware&wt=xml&indent=true

If I actually try using Oracle+Fusion+Middleware I get hits with Oracle or Fusion or
Middleware, but not just the ones with the string "Oracle Fusion Middleware".

This is the response:

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">1</int>
  <lst name="params">
    <str name="indent">true</str>
    <str name="q">summary:Oracle Fusion Middleware</str>
    <str name="wt">xml</str>
  </lst>
</lst>
<result name="response" numFound="128" start="0">
  <doc>
    <str name="id">CVE-2014-6526</str>
    <str name="summary">Unspecified vulnerability in the Oracle Directory Server Enterprise Edition component in Oracle Fusion Middleware 7.0 allows remote attackers to affect integrity via unknown vectors related to Admin Console.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6526</str>
    <long name="_version_">1491039690408591361</long></doc>
  <doc>
    <str name="id">CVE-2014-6548</str>
    <str name="summary">Unspecified vulnerability in the Oracle SOA Suite component in Oracle Fusion Middleware 11.1.1.7 allows local users to affect confidentiality, integrity, and availability via vectors related to B2B Engine.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6548</str>
    <long name="_version_">1491039690410688513</long></doc>
  <doc>
    <str name="id">CVE-2014-6580</str>
    <str name="summary">Unspecified vulnerability in the Oracle Reports Developer component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows remote attackers to affect integrity via unknown vectors.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6580</str>
    <long name="_version_">149103969042432</long></doc>
  <doc>
    <str name="id">CVE-2014-6594</str>
    <str name="summary">Unspecified vulnerability in the Oracle iLearning component in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect confidentiality via unknown vectors related to Learner Pages.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6594</str>
    <long name="_version_">1491039690435854337</long></doc>
  <doc>
    <str name="id">CVE-2015-0372</str>
    <str name="summary">Unspecified vulnerability in the Oracle Containers for J2EE component in Oracle Fusion Middleware 10.1.3.5 allows remote attackers to affect confidentiality via unknown vectors.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0372</str>
    <long name="_version_">1491039690456825857</long></doc>
  <doc>
    <str name="id">CVE-2015-0376</str>
    <str name="summary">Unspecified vulnerability in the Oracle WebCenter Content component in Oracle Fusion Middleware 11.1.1.8.0 allows remote attackers to affect integrity via unknown vectors related to Content Server.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0376</str>
    <long name="_version_">1491039690458923008</long></doc>
  <doc>
    <str name="id">CVE-2015-0420</str>
    <str name="summary">Unspecified vulnerability in the Oracle Forms component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows remote attackers to affect confidentiality via unknown vectors related to Forms Services.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0420</str>
    <long name="_version_">1491039690481991681</long></doc>
  <doc>
    <str name="id">CVE-2015-0436</str>
    <str name="summary">Unspecified vulnerability in the Oracle iLearning component in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect confidentiality via unknown vectors related to Login.</str>
    <str name="link">http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0436</str>
    <long name="_version_">1491039690488283137</long></doc>
  <doc>
    <str name="id">CVE-2014-6525</str>
    <str name="summary">Unspecified vulnerability

In a SolrCloud, will a solr core(shard replica) failover to its good peer when its state is not Active

2015-01-22 Thread 汤林
A Solr core has several states: besides Active, there are Recovering,
Down, Recovery Failed, and Gone.
I know that when the state is Recovering, a query or index request can be
failed over to its leader (the good one), but I'm not sure about the other states,
especially the Down state during the period when the Solr server is just starting.

Could anyone help to confirm? Thanks!


Re: zk disconnects and failure to retry?

2015-01-22 Thread Erick Erickson
Oh yes, lots in the past 8 months, the JIRAs can give details.

Best,
Erick

On Thu, Jan 22, 2015 at 4:10 PM, deniz denizdurmu...@gmail.com wrote:

 bumping an old entry... but are there any improvements on this issue?



 -
 Zeki ama calismiyor... Calissa yapar...
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/zk-disconnects-and-failure-to-retry-tp4065877p4181370.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: In a SolrCloud, will a solr core(shard replica) failover to its good peer when its state is not Active

2015-01-22 Thread Erick Erickson
As long as one replica for each shard is active, you should be able to
query the collection.

You can also index to the collection and it'll all just work; when the
replicas that are not active become active, they'll get the updates and
catch up to the leader. This process may take quite some time, so it is
probably best, if you have a choice, to turn indexing on after all the
replicas are up and running. This is not a requirement, however.

Best,
Erick

On Thu, Jan 22, 2015 at 6:10 PM, 汤林 tanglin0...@gmail.com wrote:

 A Solr core has several states: besides Active, there are Recovering,
 Down, Recovery Failed, and Gone.
 I know that when the state is Recovering, a query or index request can be
 failed over to its leader (the good one), but I'm not sure about the other states,
 especially the Down state during the period when the Solr server is just starting.

 Could anyone help to confirm? Thanks!



Re: In a SolrCloud, will a solr core(shard replica) failover to its good peer when its state is not Active

2015-01-22 Thread 汤林
Thanks, Erick.

You are right. My question is: when a Solr server is running but a
core (shard replica) on it is NOT Active (for example, Down), will a
query request to it be failed over to the good replica of the same shard?
Thanks!

2015-01-23 10:26 GMT+08:00 Erick Erickson erickerick...@gmail.com:

 As long as one replica for each shard is active, you should be able to
 query the collection.

 You can also index to the collection and it'll all just work; when the
 replicas that are not active become active, they'll get the updates and
 catch up to the leader. This process may take quite some time, so it is
 probably best, if you have a choice, to turn indexing on after all the
 replicas are up and running. This is not a requirement, however.

 Best,
 Erick

 On Thu, Jan 22, 2015 at 6:10 PM, 汤林 tanglin0...@gmail.com wrote:

 A Solr core has several states: besides Active, there are Recovering,
 Down, Recovery Failed, and Gone.
 I know that when the state is Recovering, a query or index request can be
 failed over to its leader (the good one), but I'm not sure about the other states,
 especially the Down state during the period when the Solr server is just starting.

 Could anyone help to confirm? Thanks!





Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Jack Krupansky
It appears that you are actually intending to query a phrase rather than a
complete sentence. The former is easy - just enclose the phrase in quotes.

A fielded query applies to a single term, a quoted phrase, or a parenthesized
sub-query, so in your query it applied only to the first term, and Solr
tried to find the remaining terms in the default query field.

-- Jack Krupansky
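
The quoting Jack describes can also be sidestepped entirely by letting a client library build the URL, so neither the shell nor HTTP mangles the phrase. A minimal sketch in Python (the `nvd-rss` core name follows this thread; adjust for your setup):

```python
from urllib.parse import urlencode

# Build a phrase query against the summary field. urlencode percent-encodes
# the quotes, colon, and spaces, so the phrase reaches Solr intact.
params = {
    "q": 'summary:"Oracle Fusion Middleware"',
    "wt": "xml",
    "indent": "true",
}
url = "http://localhost:8983/solr/nvd-rss/select?" + urlencode(params)
print(url)
# ...select?q=summary%3A%22Oracle+Fusion+Middleware%22&wt=xml&indent=true
```

Passing the resulting URL to curl inside single quotes then needs no further escaping.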

On Thu, Jan 22, 2015 at 5:47 PM, Carl Roberts carl.roberts.zap...@gmail.com
 wrote:

 Hi,

 How do you query a sentence composed of multiple words in a description
 field?

 I want to search for sentence Oracle Fusion Middleware but when I try
 the following search query in curl, I get nothing:

 curl http://localhost:8983/solr/nvd-rss/select?q=summary:Oracle Fusion
 Middleware&wt=xml&indent=true

 If I actually try using Oracle+Fusion+Middleware I get hits with Oracle
 or Fusion or Middleware but not just the ones with the string Oracle
 Fusion Middleware.

 This is the response:

 ?xml version=1.0 encoding=UTF-8?
 response

 lst name=responseHeader
   int name=status0/int
   int name=QTime1/int
   lst name=params
 str name=indenttrue/str
 str name=qsummary:Oracle Fusion Middleware/str
 str name=wtxml/str
   /lst
 /lst
 result name=response numFound=128 start=0
   doc
 str name=idCVE-2014-6526/str
 str name=summaryUnspecified vulnerability in the Oracle Directory
 Server Enterprise Edition component in Oracle Fusion Middleware 7.0 allows
 remote attackers to affect integrity via unknown vectors related to Admin
 Console./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2014-6526/str
 long name=_version_1491039690408591361/long/doc
   doc
 str name=idCVE-2014-6548/str
 str name=summaryUnspecified vulnerability in the Oracle SOA Suite
 component in Oracle Fusion Middleware 11.1.1.7 allows local users to affect
 confidentiality, integrity, and availability via vectors related to B2B
 Engine./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2014-6548/str
 long name=_version_1491039690410688513/long/doc
   doc
 str name=idCVE-2014-6580/str
 str name=summaryUnspecified vulnerability in the Oracle Reports
 Developer component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2
 allows remote attackers to affect integrity via unknown vectors./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2014-6580/str
 long name=_version_149103969042432/long/doc
   doc
 str name=idCVE-2014-6594/str
 str name=summaryUnspecified vulnerability in the Oracle iLearning
 component in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect
 confidentiality via unknown vectors related to Learner Pages./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2014-6594/str
 long name=_version_1491039690435854337/long/doc
   doc
 str name=idCVE-2015-0372/str
 str name=summaryUnspecified vulnerability in the Oracle Containers
 for J2EE component in Oracle Fusion Middleware 10.1.3.5 allows remote
 attackers to affect confidentiality via unknown vectors./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2015-0372/str
 long name=_version_1491039690456825857/long/doc
   doc
 str name=idCVE-2015-0376/str
 str name=summaryUnspecified vulnerability in the Oracle WebCenter
 Content component in Oracle Fusion Middleware 11.1.1.8.0 allows remote
 attackers to affect integrity via unknown vectors related to Content
 Server./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2015-0376/str
 long name=_version_1491039690458923008/long/doc
   doc
 str name=idCVE-2015-0420/str
 str name=summaryUnspecified vulnerability in the Oracle Forms
 component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows remote
 attackers to affect confidentiality via unknown vectors related to Forms
 Services./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2015-0420/str
 long name=_version_1491039690481991681/long/doc
   doc
 str name=idCVE-2015-0436/str
 str name=summaryUnspecified vulnerability in the Oracle iLearning
 component in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect
 confidentiality via unknown vectors related to Login./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2015-0436/str
 long name=_version_1491039690488283137/long/doc
   doc
 str name=idCVE-2014-6525/str
 str name=summaryUnspecified vulnerability in the Oracle Web
 Applications Desktop Integrator component in Oracle E-Business Suite
 11.5.10.2, 12.0.6, 12.1.3, 12.2.2, 12.2.3, and 12.2.4 allows remote
 authenticated users to affect integrity via unknown vectors related to
 Templates./str
 str name=linkhttp://web.nvd.nist.gov/view/vuln/detail?
 vulnId=CVE-2014-6525/str
 long name=_version_1491039690408591360/long/doc
   doc
 str name=idCVE-2014-6556/str
 str 

Re: Avoiding wildcard queries using edismax query parser

2015-01-22 Thread Jack Krupansky
The dismax query parser does not support wildcards. It is designed to be
simpler.

-- Jack Krupansky

On Thu, Jan 22, 2015 at 5:57 PM, Jorge Luis Betancourt González 
jlbetanco...@uci.cu wrote:

 I was also suspecting something like that; the odd thing was that with
 the dismax parser this seems to work, I mean passing a single * in the
 query just like:


 http://localhost:8983/solr/collection1/select?q=*&wt=json&indent=true&defType=dismax

 Returns:

 {
   responseHeader:{
 status:0,
 QTime:3},
   response:{numFound:0,start:0,docs:[]
   },
   highlighting:{}
 }

 Which is consistent with no * term indexed.

 Based on what I saw with dismax, I thought that perhaps a configuration
 option existed to accomplish the same with the edismax query parser, but I
 haven't found such an option.

 I'm going to test with a custom search component.

 Thanks for the quick response Alex,

 Regards,

 - Original Message -
 From: Alexandre Rafalovitch arafa...@gmail.com
 To: solr-user solr-user@lucene.apache.org
 Sent: Thursday, January 22, 2015 4:46:08 PM
 Subject: Re: Avoiding wildcard queries using edismax query parser

 I suspect the special characters get caught before the analyzer chains.

 But what about prepending a custom search component?

 Regards,
Alex.
 
 Sign up for my Solr resources newsletter at http://www.solr-start.com/


 On 22 January 2015 at 16:33, Jorge Luis Betancourt González
 jlbetanco...@uci.cu wrote:
  Hello all,
 
  Currently we are using the edismax query parser in an internal application;
 we've detected that some wildcard queries including * are causing some
 performance issues, and for this particular case we're not interested in
 allowing any user to request all the indexed documents.
 
  This could easily be escaped at the application level, but right now we
 have several applications (using several programming languages) consuming
 from Solr, and adding this into each application is kind of exhausting, so
 I'm wondering if there is some configuration that allows us to treat these
 special characters as normal alphanumeric characters.
 
  I've tried one solution that worked before, involving the
 WordDelimiterFilter and the types attribute:

  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
 generateNumberParts="0" catenateWords="0"
  catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
 preserveOriginal="0" types="characters.txt" />
 
  and in characters.txt I've mapped the special characters into ALPHA:
 
  + = ALPHA
  * = ALPHA
 
  Any thoughts on this?
 
 
  ---
  XII Aniversario de la creación de la Universidad de las Ciencias
 Informáticas. 12 años de historia junto a Fidel. 12 de diciembre de 2014.
 


 ---
 XII Aniversario de la creación de la Universidad de las Ciencias
 Informáticas. 12 años de historia junto a Fidel. 12 de diciembre de 2014.




Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Erick Erickson
Have you considered using the admin/query form? Lots of escaping is done
there for you. Once you have the form of the query down and know what to
expect, it's probably easier to enter escaping hell with curl and the
like

And what is your schema definition for the field in question? the
admin/analysis page can help a lot here.

Best,
Erick

On Thu, Jan 22, 2015 at 3:51 PM, Carl Roberts carl.roberts.zap...@gmail.com
 wrote:

 Thanks Shawn - I tried this but it does not work.  I don't even get a
 response from curl when I try that format and when I look at the logging on
 the console for Jetty I don't see anything new - it seems that the request
 is not even making it to the server.



 On 1/22/15, 6:43 PM, Shawn Heisey wrote:

 On 1/22/2015 4:31 PM, Carl Roberts wrote:

 Hi Walter,

 If I try this from my Mac shell:

  curl
 http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:"Oracle Fusion"

 I don't get a response.

 Quotes are a special character to the shell on your mac, and get removed
 from what the curl command sees.  You'll need to put the whole thing in
 quotes (so that characters like & are not interpreted by the shell) and
 then escape the quotes that you want to actually be handled by curl:

 curl
 "http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:\"Oracle Fusion\""

 Thanks,
 Shawn





Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Walter Underwood
Your query is this:

summary:Oracle Fusion Middleware

That searches for “Oracle” in the summary field and “Fusion” and “Middleware” 
in whatever your default field is.

You want:

summary:"Oracle Fusion Middleware"

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/


On Jan 22, 2015, at 2:47 PM, Carl Roberts carl.roberts.zap...@gmail.com wrote:

 Hi,
 
 How do you query a sentence composed of multiple words in a description field?
 
 I want to search for sentence Oracle Fusion Middleware but when I try the 
 following search query in curl, I get nothing:
 
 curl http://localhost:8983/solr/nvd-rss/select?q=summary:Oracle Fusion 
 Middleware&wt=xml&indent=true
 
 If I actually try using Oracle+Fusion+Middleware I get hits with Oracle or 
 Fusion or Middleware but not just the ones with the string Oracle Fusion 
 Middleware.
 
 This is the response:
 
 ?xml version=1.0 encoding=UTF-8?
 response
 
 lst name=responseHeader
  int name=status0/int
  int name=QTime1/int
  lst name=params
str name=indenttrue/str
str name=qsummary:Oracle Fusion Middleware/str
str name=wtxml/str
  /lst
 /lst
 result name=response numFound=128 start=0
  doc
str name=idCVE-2014-6526/str
str name=summaryUnspecified vulnerability in the Oracle Directory 
 Server Enterprise Edition component in Oracle Fusion Middleware 7.0 allows 
 remote attackers to affect integrity via unknown vectors related to Admin 
 Console./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6526/str
long name=_version_1491039690408591361/long/doc
  doc
str name=idCVE-2014-6548/str
str name=summaryUnspecified vulnerability in the Oracle SOA Suite 
 component in Oracle Fusion Middleware 11.1.1.7 allows local users to affect 
 confidentiality, integrity, and availability via vectors related to B2B 
 Engine./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6548/str
long name=_version_1491039690410688513/long/doc
  doc
str name=idCVE-2014-6580/str
str name=summaryUnspecified vulnerability in the Oracle Reports 
 Developer component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows 
 remote attackers to affect integrity via unknown vectors./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6580/str
long name=_version_149103969042432/long/doc
  doc
str name=idCVE-2014-6594/str
str name=summaryUnspecified vulnerability in the Oracle iLearning 
 component in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect 
 confidentiality via unknown vectors related to Learner Pages./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6594/str
long name=_version_1491039690435854337/long/doc
  doc
str name=idCVE-2015-0372/str
str name=summaryUnspecified vulnerability in the Oracle Containers for 
 J2EE component in Oracle Fusion Middleware 10.1.3.5 allows remote attackers 
 to affect confidentiality via unknown vectors./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0372/str
long name=_version_1491039690456825857/long/doc
  doc
str name=idCVE-2015-0376/str
str name=summaryUnspecified vulnerability in the Oracle WebCenter 
 Content component in Oracle Fusion Middleware 11.1.1.8.0 allows remote 
 attackers to affect integrity via unknown vectors related to Content 
 Server./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0376/str
long name=_version_1491039690458923008/long/doc
  doc
str name=idCVE-2015-0420/str
str name=summaryUnspecified vulnerability in the Oracle Forms 
 component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows remote 
 attackers to affect confidentiality via unknown vectors related to Forms 
 Services./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0420/str
long name=_version_1491039690481991681/long/doc
  doc
str name=idCVE-2015-0436/str
str name=summaryUnspecified vulnerability in the Oracle iLearning 
 component in Oracle iLearning 6.0 and 6.1 allows remote attackers to affect 
 confidentiality via unknown vectors related to Login./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0436/str
long name=_version_1491039690488283137/long/doc
  doc
str name=idCVE-2014-6525/str
str name=summaryUnspecified vulnerability in the Oracle Web 
 Applications Desktop Integrator component in Oracle E-Business Suite 
 11.5.10.2, 12.0.6, 12.1.3, 12.2.2, 12.2.3, and 12.2.4 allows remote 
 authenticated users to affect integrity via unknown vectors related to 
 Templates./str
str 
 name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6525/str
long name=_version_1491039690408591360/long/doc
  doc
str name=idCVE-2014-6556/str
str name=summaryUnspecified vulnerability in the Oracle Applications 
 DBA component in Oracle E-Business Suite 

Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Shawn Heisey
On 1/22/2015 4:31 PM, Carl Roberts wrote:
 Hi Walter,

 If I try this from my Mac shell:

 curl
 http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:"Oracle Fusion"

 I don't get a response.

Quotes are a special character to the shell on your mac, and get removed
from what the curl command sees.  You'll need to put the whole thing in
quotes (so that characters like & are not interpreted by the shell) and
then escape the quotes that you want to actually be handled by curl:

curl
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:\"Oracle Fusion\""

Thanks,
Shawn
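
The shell behavior Shawn describes can be reproduced without a Solr server: Python's `shlex` follows the same POSIX quoting rules, so it shows exactly what curl ends up receiving in each case. A small illustration:

```python
import shlex

# Unescaped inner quotes: the shell strips them before curl sees the URL.
unquoted = 'curl http://localhost:8983/solr/nvd-rss/select?q=summary:"Oracle Fusion"'
print(shlex.split(unquoted)[1])
# http://localhost:8983/solr/nvd-rss/select?q=summary:Oracle Fusion

# Whole URL double-quoted, inner quotes backslash-escaped: they survive
# and are handed to curl, which is what the query parser needs to see.
escaped = 'curl "http://localhost:8983/solr/nvd-rss/select?q=summary:\\"Oracle Fusion\\""'
print(shlex.split(escaped)[1])
# http://localhost:8983/solr/nvd-rss/select?q=summary:"Oracle Fusion"
```

Alternatively, `curl -G --data-urlencode 'q=summary:"Oracle Fusion"' 'http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true'` lets curl do the URL encoding itself, so no backslash escaping is needed at all.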



Re: How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Carl Roberts
Thanks Shawn - I tried this but it does not work.  I don't even get a 
response from curl when I try that format and when I look at the logging 
on the console for Jetty I don't see anything new - it seems that the 
request is not even making it to the server.



On 1/22/15, 6:43 PM, Shawn Heisey wrote:

On 1/22/2015 4:31 PM, Carl Roberts wrote:

Hi Walter,

If I try this from my Mac shell:

 curl
http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:"Oracle Fusion"

I don't get a response.

Quotes are a special character to the shell on your mac, and get removed
from what the curl command sees.  You'll need to put the whole thing in
quotes (so that characters like & are not interpreted by the shell) and
then escape the quotes that you want to actually be handled by curl:

curl
"http://localhost:8983/solr/nvd-rss/select?wt=json&indent=true&q=summary:\"Oracle Fusion\""

Thanks,
Shawn





How do you query a sentence composed of multiple words in a description field?

2015-01-22 Thread Carl Roberts

Hi,

How do you query a sentence composed of multiple words in a description 
field?


I want to search for sentence Oracle Fusion Middleware but when I try 
the following search query in curl, I get nothing:


curl http://localhost:8983/solr/nvd-rss/select?q=summary:Oracle Fusion 
Middleware&wt=xml&indent=true


If I actually try using Oracle+Fusion+Middleware I get hits with 
Oracle or Fusion or Middleware but not just the ones with the string 
Oracle Fusion Middleware.


This is the response:

?xml version=1.0 encoding=UTF-8?
response

lst name=responseHeader
  int name=status0/int
  int name=QTime1/int
  lst name=params
str name=indenttrue/str
str name=qsummary:Oracle Fusion Middleware/str
str name=wtxml/str
  /lst
/lst
result name=response numFound=128 start=0
  doc
str name=idCVE-2014-6526/str
str name=summaryUnspecified vulnerability in the Oracle 
Directory Server Enterprise Edition component in Oracle Fusion 
Middleware 7.0 allows remote attackers to affect integrity via unknown 
vectors related to Admin Console./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6526/str

long name=_version_1491039690408591361/long/doc
  doc
str name=idCVE-2014-6548/str
str name=summaryUnspecified vulnerability in the Oracle SOA 
Suite component in Oracle Fusion Middleware 11.1.1.7 allows local users 
to affect confidentiality, integrity, and availability via vectors 
related to B2B Engine./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6548/str

long name=_version_1491039690410688513/long/doc
  doc
str name=idCVE-2014-6580/str
str name=summaryUnspecified vulnerability in the Oracle Reports 
Developer component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 
allows remote attackers to affect integrity via unknown vectors./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6580/str

long name=_version_149103969042432/long/doc
  doc
str name=idCVE-2014-6594/str
str name=summaryUnspecified vulnerability in the Oracle 
iLearning component in Oracle iLearning 6.0 and 6.1 allows remote 
attackers to affect confidentiality via unknown vectors related to 
Learner Pages./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6594/str

long name=_version_1491039690435854337/long/doc
  doc
str name=idCVE-2015-0372/str
str name=summaryUnspecified vulnerability in the Oracle 
Containers for J2EE component in Oracle Fusion Middleware 10.1.3.5 
allows remote attackers to affect confidentiality via unknown vectors./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0372/str

long name=_version_1491039690456825857/long/doc
  doc
str name=idCVE-2015-0376/str
str name=summaryUnspecified vulnerability in the Oracle 
WebCenter Content component in Oracle Fusion Middleware 11.1.1.8.0 
allows remote attackers to affect integrity via unknown vectors related 
to Content Server./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0376/str

long name=_version_1491039690458923008/long/doc
  doc
str name=idCVE-2015-0420/str
str name=summaryUnspecified vulnerability in the Oracle Forms 
component in Oracle Fusion Middleware 11.1.1.7 and 11.1.2.2 allows 
remote attackers to affect confidentiality via unknown vectors related 
to Forms Services./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0420/str

long name=_version_1491039690481991681/long/doc
  doc
str name=idCVE-2015-0436/str
str name=summaryUnspecified vulnerability in the Oracle 
iLearning component in Oracle iLearning 6.0 and 6.1 allows remote 
attackers to affect confidentiality via unknown vectors related to 
Login./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2015-0436/str

long name=_version_1491039690488283137/long/doc
  doc
str name=idCVE-2014-6525/str
str name=summaryUnspecified vulnerability in the Oracle Web 
Applications Desktop Integrator component in Oracle E-Business Suite 
11.5.10.2, 12.0.6, 12.1.3, 12.2.2, 12.2.3, and 12.2.4 allows remote 
authenticated users to affect integrity via unknown vectors related to 
Templates./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6525/str

long name=_version_1491039690408591360/long/doc
  doc
str name=idCVE-2014-6556/str
str name=summaryUnspecified vulnerability in the Oracle 
Applications DBA component in Oracle E-Business Suite 11.5.10.2, 12.0.6, 
12.1.3, 12.2.2, 12.2.3, and 12.2.4 allows remote authenticated users to 
affect confidentiality, integrity, and availability via vectors related 
to AD_DDL./str
str 
name=linkhttp://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2014-6556/str

long name=_version_1491039690412785664/long/doc
/result
/response


Re: Avoiding wildcard queries using edismax query parser

2015-01-22 Thread Jorge Luis Betancourt González
I was also suspecting something like that; the odd thing was that with the 
dismax parser this seems to work, I mean passing a single * in the query just 
like:

http://localhost:8983/solr/collection1/select?q=*&wt=json&indent=true&defType=dismax

Returns:

{
  responseHeader:{
status:0,
QTime:3},
  response:{numFound:0,start:0,docs:[]
  },
  highlighting:{}
}

Which is consistent with no * term indexed. 

Based on what I saw with dismax, I thought that perhaps a configuration option 
existed to accomplish the same with the edismax query parser, but I haven't 
found such an option. 

I'm going to test with a custom search component. 

Thanks for the quick response Alex,

Regards,

- Original Message -
From: Alexandre Rafalovitch arafa...@gmail.com
To: solr-user solr-user@lucene.apache.org
Sent: Thursday, January 22, 2015 4:46:08 PM
Subject: Re: Avoiding wildcard queries using edismax query parser

I suspect the special characters get caught before the analyzer chains.

But what about prepending a custom search component?

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 22 January 2015 at 16:33, Jorge Luis Betancourt González
jlbetanco...@uci.cu wrote:
 Hello all,

 Currently we are using the edismax query parser in an internal application; we've 
 detected that some wildcard queries including * are causing some 
 performance issues, and for this particular case we're not interested in 
 allowing any user to request all the indexed documents.

 This could easily be escaped at the application level, but right now we have 
 several applications (using several programming languages) consuming from 
 Solr, and adding this into each application is kind of exhausting, so I'm 
 wondering if there is some configuration that allows us to treat these special 
 characters as normal alphanumeric characters.

 I've tried one solution that worked before, involving the WordDelimiterFilter 
 and the types attribute:

 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" 
 generateNumberParts="0" catenateWords="0"
 catenateNumbers="0" catenateAll="0" splitOnCaseChange="0" 
 preserveOriginal="0" types="characters.txt" />

 and in characters.txt I've mapped the special characters into ALPHA:

 + = ALPHA
 * = ALPHA

 Any thoughts on this?


 ---
 XII Aniversario de la creación de la Universidad de las Ciencias 
 Informáticas. 12 años de historia junto a Fidel. 12 de diciembre de 2014.



---
XII Aniversario de la creación de la Universidad de las Ciencias Informáticas. 
12 años de historia junto a Fidel. 12 de diciembre de 2014.



Re: zk disconnects and failure to retry?

2015-01-22 Thread deniz
bumping an old entry... but are there any improvements on this issue?



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/zk-disconnects-and-failure-to-retry-tp4065877p4181370.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Avoiding wildcard queries using edismax query parser

2015-01-22 Thread Jorge Luis Betancourt González
Hi Jack!

Yes, that was my point. I was thinking that edismax, being an extended version 
of dismax, perhaps had a switch to turn this feature on/off or to put some 
limits on it. I've tried the multiterm approach but with no luck; the * keeps being 
treated as a match-all query, as far as I can see from enabling debug output:

   rawquerystring: *,
   querystring: *,
   parsedquery: (+MatchAllDocsQuery(*:*) () 
FunctionQuery(1.0/(3.16E-11*float(ms(const(142198920),date(lastModified)))+1.0)))/no_coord,
  
The query gets translated into a MatchAllDocsQuery, which I think happens 
before the textual analysis.
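
Since the bare * is rewritten to a MatchAllDocsQuery before any analysis runs, a filter chain can't intercept it; the practical options are a shared custom search component or sanitizing at the client. A hypothetical client-side sketch (the helper names are mine, not a Solr API):

```python
# Characters the Lucene/Solr query syntax treats specially.
SPECIAL = '+-&|!(){}[]^"~*?:\\/'

def escape_query(q: str) -> str:
    """Backslash-escape syntax characters so user input is treated
    as literal text rather than operators or wildcards."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in q)

def sanitize(q: str) -> str:
    """Reject bare match-all queries outright; escape everything else."""
    if q.strip() in ("", "*", "*:*"):
        raise ValueError("match-all queries are not allowed")
    return escape_query(q)

print(sanitize("oracle*"))  # oracle\*
```

This still has to live in every client application, which is exactly the pain point raised in this thread; a custom SearchComponent deployed once on the Solr side avoids the duplication.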

- Original Message -
From: Jack Krupansky jack.krupan...@gmail.com
To: solr-user@lucene.apache.org
Sent: Friday, January 23, 2015 12:02:44 AM
Subject: Re: Avoiding wildcard queries using edismax query parser

The dismax query parser does not support wildcards. It is designed to be
simpler.

-- Jack Krupansky

On Thu, Jan 22, 2015 at 5:57 PM, Jorge Luis Betancourt González 
jlbetanco...@uci.cu wrote:

 I was also suspecting something like that; the odd thing was that with
 the dismax parser this seems to work, I mean passing a single * in the
 query just like:


 http://localhost:8983/solr/collection1/select?q=*&wt=json&indent=true&defType=dismax

 Returns:

 {
   responseHeader:{
 status:0,
 QTime:3},
   response:{numFound:0,start:0,docs:[]
   },
   highlighting:{}
 }

 Which is consistent with no * term indexed.

 Based on what I saw with dismax, I thought that perhaps a configuration
 option existed to accomplish the same with the edismax query parser, but I
 haven't found such an option.

 I'm going to test with a custom search component.

 Thanks for the quick response Alex,

 Regards,

 - Original Message -
 From: Alexandre Rafalovitch arafa...@gmail.com
 To: solr-user solr-user@lucene.apache.org
 Sent: Thursday, January 22, 2015 4:46:08 PM
 Subject: Re: Avoiding wildcard queries using edismax query parser

 I suspect the special characters get caught before the analyzer chains.

 But what about prepending a custom search component?

 Regards,
Alex.
 
 Sign up for my Solr resources newsletter at http://www.solr-start.com/


 On 22 January 2015 at 16:33, Jorge Luis Betancourt González
 jlbetanco...@uci.cu wrote:
  Hello all,
 
  Currently we are using the edismax query parser in an internal application;
 we've detected that some wildcard queries including * are causing some
 performance issues, and for this particular case we're not interested in
 allowing any user to request all the indexed documents.
 
  This could easily be escaped at the application level, but right now we
 have several applications (using several programming languages) consuming
 from Solr, and adding this into each application is kind of exhausting, so
 I'm wondering if there is some configuration that allows us to treat these
 special characters as normal alphanumeric characters.
 
  I've tried one solution that worked before, involving the
 WordDelimiterFilter and the types attribute:

  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0"
 generateNumberParts="0" catenateWords="0"
  catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"
 preserveOriginal="0" types="characters.txt" />
 
  and in characters.txt I've mapped the special characters into ALPHA:
 
  + = ALPHA
  * = ALPHA
 
  Any thoughts on this?
 
 
  ---
  XII Aniversario de la creación de la Universidad de las Ciencias
 Informáticas. 12 años de historia junto a Fidel. 12 de diciembre de 2014.
 


 ---
 XII Aniversario de la creación de la Universidad de las Ciencias
 Informáticas. 12 años de historia junto a Fidel. 12 de diciembre de 2014.




---
XII Aniversario de la creación de la Universidad de las Ciencias Informáticas. 
12 años de historia junto a Fidel. 12 de diciembre de 2014.



trying to get Apache Solr working with Dovecot.

2015-01-22 Thread Kevin Laurie
Hello,

I am desperately trying to get Apache Solr to work with Dovecot FTS. I
would really appreciate it if someone could please help me!
I have already done the following:-


1. I can ssh into my server and see that Apache Solr is up and running.

 ssh -t -L 8983:localhost:8983 u...@mydomain.com


2. In the collection1 core selector I have the following files:-
solrconfig.xml and schema.xml
The schema.xml output is as follows(see link):-

http://pastebin.com/thGw2pQj

3. I have installed the dovecot-solr already.

4. Configured dovecot to run solr-fts as follows:-

In 10-mail.conf:

# Space separated list of plugins to load for all services. Plugins
specific to
# IMAP, LDA, etc. are added to this list in their own .conf files.
#mail_plugins =
mail_plugins = fts fts_solr

In 90-plugin.conf:-

plugin {
   fts = solr
   fts_solr = break-imap-search url=http://localhost:8983/solr/
}

Despite all the above the Solr does not seem to run FTS for Dovecot.
Appreciate if some one could give some feedback

Thanks
Kevin


Re: In a SolrCloud, will a solr core(shard replica) failover to its good peer when its state is not Active

2015-01-22 Thread 汤林
Thanks, Erick.

From a testing perspective, if we would like to verify that a query
request to a down core on a running server fails over to the
good core on another running server, is there any way to mark a core as
down on a running server? Thanks!

We tried changing /clusterstate.json in ZooKeeper to mark an active
core as down, but that seems to only change the state in ZK, while the core
still functions in the Solr server.

2015-01-23 12:18 GMT+08:00 Erick Erickson erickerick...@gmail.com:

 In a word, yes. As far as querying is concerned, there is only Active
 and all other states, and
 requests are only routed to replicas in the Active state.

 Best,
 Erick

 On Thu, Jan 22, 2015 at 6:34 PM, 汤林 tanglin0...@gmail.com wrote:

 Thanks, Erick.

 You are right. My question is: when a Solr server is running, but a
 core (shard replica) on it is NOT Active, for example Down, will a
 query request to it fail over to the good replica of the same shard?
 Thanks!

 2015-01-23 10:26 GMT+08:00 Erick Erickson erickerick...@gmail.com:

 As long as one replica for each shard is active, you should be able to
 query the collection.

 You can also index to the collection and it'll all just work; when the
 replicas that are not active become active they'll get the updates and
 catch up to the leader. This process may take quite some time, so it is
 probably best, if you have a choice, to turn indexing on after all the
 replicas are up and running. This is not a requirement, however.

 Best,
 Erick

 On Thu, Jan 22, 2015 at 6:10 PM, 汤林 tanglin0...@gmail.com wrote:

 A Solr core has several states: besides Active, there are Recovering,
 Down, Recovery failed, and Gone.
 I know that when the state is Recovering, a query or index request can
 fail over to its leader (the good one), but I'm not sure about the other
 states, especially the Down state during the period just after the Solr
 server starts.

 Could anyone help to confirm? Thanks!







Using tmpfs for Solr index

2015-01-22 Thread deniz
Would it boost performance if the index were switched from
RAMDirectoryFactory to tmpfs? Or would it simply do the same thing as
MMap?

And if tmpfs would indeed be better than RAMDirectory or
MMap, which directory factory would be the most suitable for this
purpose?

Regards,



-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Using-tmpfs-for-Solr-index-tp4181399.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is there a way to pass in proxy settings to Solr?

2015-01-22 Thread Shawn Heisey
On 1/22/2015 9:18 AM, Carl Roberts wrote:
 Is there a way to pass in proxy settings to Solr?

 The reason that I am asking this question is that I am trying to run
 the DIH RSS example, and it is not working when I try to import the
 RSS feed URL because the code in Solr comes back with an unknown host
 exception due to the proxy that we use at work.

 If I use the curl tool and the environment variable http_proxy to
 access the RSS feed directly it works, but it appears Solr does not
 use that environment variable because it is throwing this error:

 39642 [Thread-15] ERROR
 org.apache.solr.handler.dataimport.URLDataSource – Exception thrown
 while getting data

Checking the code, URLDataSource seems to use the URL capability that
comes with Java itself.  The system properties on this page are very
likely to affect objects that come with Java:

http://docs.oracle.com/javase/7/docs/api/java/net/doc-files/net-properties.html#Proxies

You would need to set these properties on the java commandline that
starts your servlet container, with the -D option.
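As an illustration only (the host and port below are placeholders, and the usual approach is -D flags on the startup command line rather than code), these standard properties can also be set programmatically before any java.net.URL connection is opened:

```java
// Sketch only: placeholder proxy host/port values.
// java.net.URL-based clients (such as DIH's URLDataSource) consult
// these standard Java networking system properties.
public class ProxySetup {
    public static void main(String[] args) {
        System.setProperty("http.proxyHost", "proxy.example.com"); // placeholder
        System.setProperty("http.proxyPort", "8080");              // placeholder
        // Addresses that should bypass the proxy:
        System.setProperty("http.nonProxyHosts", "localhost|127.0.0.1");

        System.out.println(System.getProperty("http.proxyHost"));
        System.out.println(System.getProperty("http.proxyPort"));
    }
}
```

The command-line equivalent would be along the lines of `java -Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8080 ...` prepended to however your servlet container is started.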

Thanks,
Shawn



Re: Solr 4.10.3 start up issue

2015-01-22 Thread Chris Hostetter

: had thought to do this before - and should have; I uploaded the full
: example collection configuration to ZK just now and tried again. Magic, it
: worked, which left me feeling a bit glum. Well, happy that it wasn't Solr.
: Now if you'll excuse me, I have a conf review to perform.

if your problem really is related to SOLR-6643, then you should be able to 
get more details by doing an explicit core creation using those config 
files *after* solr starts up (it's a quirk of where/how core loading is 
different on startup vs via the CoreAdminHandler - noted in the jira)

that may help you pinpoint the problem.

in theory: if the configset is already in zk, then the CollectionAdmin 
CREATE command will help you find the same errors -- but my suggestion 
would be to keep it simple: single node solr, startup with no cores, then 
do core CREATE pointed at a directory with your configs and see what error 
you get.


: 
: Darren
: 
: On Wed, Jan 21, 2015 at 6:48 PM, Chris Hostetter hossman_luc...@fucit.org
: wrote:
: 
: 
:  : I posted a question on stackoverflow but in hindsight this would have
:  been
:  : a better place to start. Below is the link.
:  :
:  : Basically I can't get the example working when using an external ZK
:  cluster
:  : and auto-core discovery. Solr 4.10.1 works fine, but the newest release
: 
:  your SO URL shows the output of using your custom configs, but not what
:  you got with the example configs -- so it's not clear to me if there is
:  really just one problem, or perhaps 2?
: 
:  you also mentioned a lot of details about how you are using solr with zk,
:  and what doesn't work, but it's not clear if you tried other simpler steps
:  using your configs -- or the example configs -- and if those simpler steps
:  *did* work (ie: single node solr startup?)
: 
:  my best guess, based on the logs you did post and the mention of
:  lib/mq/solr-search-ahead-2.0.0.jar in those logs, is that the entire
:  question of zk and cluster state and leaders is a red herring, and what
:  you are running into is: SOLR-6643...
: 
:  https://issues.apache.org/jira/browse/SOLR-6643
: 
:  ...if i'm right, then simple core discovery with your configs on a single
:  node solr instance w/o any knowledge of ZK will also fail to init the core
:  -- and if you try to use the CoreAdmin API to CREATE a core, you'll get
:  some kind of LinkageError.
: 
: 
: 
: 
:  : Here is the stackoverflow question, along with the full log output:
:  :
:  
http://stackoverflow.com/questions/28004832/solr-4-10-3-is-not-proceeding-to-leader-election-on-new-cluster-startup-hangs
: 
: 
:  -Hoss
:  http://www.lucidworks.com/
: 
: 
: 
: 
: -- 
: Darren
: 

-Hoss
http://www.lucidworks.com/


Re: Issue with Solr multiple sort

2015-01-22 Thread Erick Erickson
Shamik:

Nice job of including the relevant information and just the relevant info!

One addition to what Chris said that _may_ be relevant in future. The
string type
is totally unanalyzed, so sorting done on that field may be
case-sensitive, leading
to some confusion. If the schema has a lowercase type, that may be better; it's
just KeywordTokenizerFactory and LowerCaseFilterFactory as the analysis chain.
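For reference, a minimal sketch of such a type (the name and attributes here are illustrative, not taken from any particular schema):

```xml
<!-- Illustrative only: a single-token, case-insensitive type for sorting -->
<fieldType name="lowercase" class="solr.TextField" sortMissingLast="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```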

FWIW,
Erick

On Wed, Jan 21, 2015 at 5:25 PM, shamik sham...@gmail.com wrote:
 Thanks Hoss for clearing up my doubt. I was confused with the ordering. So I
 guess, the first field is always the primary sort field followed by
 secondary.

 Thanks again.



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Issue-with-Solr-multiple-sort-tp4181056p4181062.html
 Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Recovery process

2015-01-22 Thread Erick Erickson
Shalin:

Just to see if my understanding is correct, how often would you expect case 2
to occur? My assumption so far is that it would be quite rare that the leader
and all replicas happened to hit autocommit points at the same time, and thus
it would be safe to just bring down a few segments. But that's
an assumption; I have no facts to back it up.

Nishanth:

Currently no, you can't configure the missed updates and still peer
sync. Getting
to the bottom of the connection resets seems indicated.

Best
Erick

On Wed, Jan 21, 2015 at 6:46 PM, Nishanth S nishanth.2...@gmail.com wrote:
 Thank you Shalin. So in a system where the indexing rate is more than 5K TPS
 or so, the replica will never be able to recover through the peer sync
 process. In my case I have mostly seen step 3, where a full copy happens,
 and if the index size is huge it takes a very long time for replicas to
 recover. Is there a way we can configure the number of missed updates for
 peer sync?

 Thanks,
 Nishanth

 On Wed, Jan 21, 2015 at 4:47 PM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

 Hi Nishanth,

 The recovery happens as follows:

 1. PeerSync is attempted first. If the number of new updates on leader is
 less than 100 then the missing documents are fetched directly and indexed
 locally. The tlog tells us the last 100 updates very quickly. Other uses of
 the tlog are for durability of updates and of course, startup recovery.
 2. If the above step fails then replication recovery is attempted. A hard
 commit is called on the leader and then the leader is polled for the latest
 index version and generation. If the leader's version and generation are
 greater than local index's version/generation then the difference of the
 index files between leader and replica are fetched and installed.
 3. If the above fails (because leader's version/generation is somehow equal
 or more than local) then a full index recovery happens and the entire index
 from the leader is fetched and installed locally.

 There are some other details involved in this process too but probably not
 worth going into here.
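As a purely illustrative sketch (NOT Solr's actual code; the strategy names and the 100-update threshold are taken only from the description above), the fallback order could be expressed as:

```java
// Illustrative pseudologic of the three-step recovery fallback described
// above; the real implementation has many more details and safeguards.
public class RecoverySketch {
    enum Strategy { PEER_SYNC, REPLICATE_DIFF, FULL_COPY }

    static Strategy choose(int updatesMissed,
                           long leaderGeneration, long localGeneration) {
        if (updatesMissed < 100) {
            return Strategy.PEER_SYNC;      // step 1: fetch missed docs via tlog
        }
        if (leaderGeneration > localGeneration) {
            return Strategy.REPLICATE_DIFF; // step 2: fetch changed index files
        }
        return Strategy.FULL_COPY;          // step 3: fetch the entire index
    }

    public static void main(String[] args) {
        System.out.println(choose(50, 10, 10));   // few missed updates
        System.out.println(choose(5000, 12, 10)); // leader ahead
        System.out.println(choose(5000, 10, 10)); // generations equal
    }
}
```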

 On Wed, Jan 21, 2015 at 5:13 PM, Nishanth S nishanth.2...@gmail.com
 wrote:

  Hello Everyone,
 
  I am hitting a few issues with solr replicas going into recovery and then
  doing a full index copy. I am trying to understand the solr recovery
  process. I have read a few blogs on this and saw that when a leader
  notifies a replica to recover (in my case it is due to connection resets)
  it will try to do a peer sync first, and if the missed updates are more
  than 100 it will do a full index copy from the leader. I am trying to
  understand what peer sync is and where the tlog comes into the picture.
  Are tlogs replayed only during server restart? Can someone help me with
  this?
 
  Thanks,
  Nishanth
 



 --
 Regards,
 Shalin Shekhar Mangar.



How to query raw query String with Solrj?

2015-01-22 Thread Tim Molter
I'd like to query solr with solrj with a raw query such as:
`class_id%3ABINGBONG%0ABlah%3A3232235780&sort=id+desc&rows=100`. These
queries are stored in a database and I cannot use the builder API
offered by solrj (SolrQuery). Any suggestions?





Re: Suggester Example In Documentation Not Working

2015-01-22 Thread Tomás Fernández Löbbe
I see that the docs say that the doc needs to be indexed only, but for
Fuzzy or Analyzed, I think the field needs to be stored. On the other hand,
I'm not sure how much sense it makes to use either of those two
implementations if the field type you want to have is string.

Tomás

On Thu, Jan 22, 2015 at 8:14 AM, Charles Sanders csand...@redhat.com
wrote:

 Attempting to follow the documentation found here:
 https://cwiki.apache.org/confluence/display/solr/Suggester

 The example given in the documentation is not working. See below my
 configuration. I only changed the field names to those in my schema. Can
 anyone provide an example for this component that actually works?

 searchComponent name=suggest class=solr.SuggestComponent
 lst name=suggester
 str name=namemySuggester/str
 str name=lookupImplFuzzyLookupFactory/str
 str name=dictionaryImplDocumentDictionaryFactory/str
 str name=fieldsugg_allText/str
 str name=weightFieldsuggestWeight/str
 str name=suggestAnalyzerFieldTypestring/str
 /lst
 /searchComponent

 requestHandler name=/suggest class=solr.SearchHandler startup=lazy
 lst name=defaults
 str name=suggesttrue/str
 str name=suggest.count10/str
 str name=suggest.buildtrue/str
 /lst
 arr name=components
 strsuggest/str
 /arr
 /requestHandler

 field name=sugg_allText type=string indexed=true multiValued=true
 stored=false/
 field name=suggestWeight type=long indexed=true stored=true
 default=1 /



 http://localhost:/solr/collection1/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&wt=json&suggest.q=kern


 {responseHeader:{status:0,QTime:4},command:build,suggest:{mySuggester:{kern:{numFound:0,suggestions:[]



Re: Suggester Example In Documentation Not Working

2015-01-22 Thread Chris Hostetter

1) which version of Solr are you using? (note that the online HTML ref 
guide is a DRAFT that applies to 5.0 - you may want to review the 
specific released version of the ref guide that applies to your version of 
solr: http://archive.apache.org/dist/lucene/solr/ref-guide/)

2) the behavior of the suggester is very specific to the contents of the 
dictionary built -- the examples on that page apply to the example docs 
included with solr -- hence the techproduct data, and the example queries 
for input like elec suggesting electronics

nowhere on that page is an example using the query kern -- whether or 
not that input would return a suggestion is going to be entirely dependent 
on whether the dictionary you built contains any similar terms to suggest.

if you can, please post more details about your documents -- ideally a full 
set of all the documents in your index (using a small test index of 
course); that may help to understand the results you are getting.



: Date: Thu, 22 Jan 2015 11:14:43 -0500 (EST)
: From: Charles Sanders csand...@redhat.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Suggester Example In Documentation Not Working
: 
: Attempting to follow the documentation found here: 
: https://cwiki.apache.org/confluence/display/solr/Suggester 
: 
: The example given in the documentation is not working. See below my 
configuration. I only changed the field names to those in my schema. Can anyone 
provide an example for this component that actually works? 
: 
: searchComponent name=suggest class=solr.SuggestComponent 
: lst name=suggester 
: str name=namemySuggester/str 
: str name=lookupImplFuzzyLookupFactory/str 
: str name=dictionaryImplDocumentDictionaryFactory/str 
: str name=fieldsugg_allText/str 
: str name=weightFieldsuggestWeight/str 
: str name=suggestAnalyzerFieldTypestring/str 
: /lst 
: /searchComponent 
: 
: requestHandler name=/suggest class=solr.SearchHandler startup=lazy 
: lst name=defaults 
: str name=suggesttrue/str 
: str name=suggest.count10/str 
: str name=suggest.buildtrue/str 
: /lst 
: arr name=components 
: strsuggest/str 
: /arr 
: /requestHandler 
: 
: field name=sugg_allText type=string indexed=true multiValued=true 
stored=false/ 
: field name=suggestWeight type=long indexed=true stored=true 
default=1 / 
: 
: 
: 
http://localhost:/solr/collection1/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&wt=json&suggest.q=kern
 
: 
: 
{responseHeader:{status:0,QTime:4},command:build,suggest:{mySuggester:{kern:{numFound:0,suggestions:[]
 
: 

-Hoss
http://www.lucidworks.com/


Re: How to query raw query String with Solrj?

2015-01-22 Thread Erik Hatcher
Maybe SolrRequestParsers.parseQueryString() is what you're looking for.
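Alternatively, as a hedged sketch (assuming the stored strings are ordinary URL-encoded key=value pairs), you could decode them with plain Java and then feed each entry into a SolrQuery via set(key, value):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: decode a stored raw query string into parameter pairs.
// Each resulting entry can then be passed to SolrQuery.set(key, value).
public class RawQueryParser {
    public static Map<String, String> parse(String raw)
            throws UnsupportedEncodingException {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : raw.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String val = eq < 0 ? "" : pair.substring(eq + 1);
            // URL-decode both halves; '+' becomes a space, %3A becomes ':'
            params.put(URLDecoder.decode(key, "UTF-8"),
                       URLDecoder.decode(val, "UTF-8"));
        }
        return params;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse("q=class_id%3ABINGBONG&sort=id+desc&rows=100"));
        // prints {q=class_id:BINGBONG, sort=id desc, rows=100}
    }
}
```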

Erik


 On Jan 22, 2015, at 9:41 AM, Tim Molter tim.mol...@gmail.com wrote:
 
 I'd like to query solr with solrj with a raw query such as:
 `class_id%3ABINGBONG%0ABlah%3A3232235780&sort=id+desc&rows=100`. These
 queries are stored in a database and I cannot use the builder API
 offered by solrj (SolrQuery). Any suggestions?
 



Re: trying to get Apache Solr working with Dovecot.

2015-01-22 Thread Alexandre Rafalovitch
Well, what does seem to happen?

Which version of Solr is it? Can Dovecot contact Solr? If you put
netcat listening on that port instead of Solr, is it being connected to?

If it is, is Solr complaining about wrong url or anything in the log?
Exceptions maybe.

How far into the Dovecot-Solr path did you trace so far?

Regards,
   Alex.

Sign up for my Solr resources newsletter at http://www.solr-start.com/


On 23 January 2015 at 01:06, Kevin Laurie superinterstel...@gmail.com wrote:
 Hello,

 I am desperately trying to get Apache Solr to work with Dovecot FTS. I
 would really appreciate if someone could please help me!
 I have already done the following:-


 1. I can ssh into my server and see that Apache Solr is up and running.

  ssh -t -L 8983:localhost:8983 u...@mydomain.com


 2. In the collection1 core selector I have the following files:-
 solrconfig.xml and schema.xml
 The schema.xml output is as follows(see link):-

 http://pastebin.com/thGw2pQj

 3. I have installed the dovecot-solr already.

 4. Configured dovecot to run solr-fts as follows:-

 In 10-mail.conf:

 # Space separated list of plugins to load for all services. Plugins
 specific to
 # IMAP, LDA, etc. are added to this list in their own .conf files.
 #mail_plugins =
 mail_plugins = fts fts_solr

 In 90-plugin.conf:-

 plugin {
fts = solr
fts_solr = break-imap-search url=http://localhost:8983/solr/
 }

 Despite all the above the Solr does not seem to run FTS for Dovecot.
 Appreciate if some one could give some feedback

 Thanks
 Kevin


Count total frequency of a word in a SOLR index

2015-01-22 Thread Nitin Solanki
I indexed some text files in Solr as-is, and applied
*StandardTokenizerFactory* and *ShingleFilterFactory* on the text_file field.

*Configuration of schema.xml (structure below):*

field name=id type=string indexed=true stored=true required=true
multiValued=false /
field name=text_file type=textSpell indexed=true stored=true
required=true multiValued=false/

fieldType name=textSpell class=solr.TextField positionIncrementGap=100
  analyzer type=index
    tokenizer class=solr.StandardTokenizerFactory/
    filter class=solr.ShingleFilterFactory maxShingleSize=5
    minShingleSize=2 outputUnigrams=true/
  /analyzer
  analyzer type=query
    tokenizer class=solr.StandardTokenizerFactory/
    filter class=solr.ShingleFilterFactory maxShingleSize=5
    minShingleSize=2 outputUnigrams=true/
  /analyzer
/fieldType

*Stored documents like:*
*[{id:1, text_file: text of document}, {id:2, text_file: text of
document}, and so on]*

*Problem*: If I search for a word in the Solr index I get a count of the
documents which contain this word, but if the word occurs multiple times
in a document, the count is still 1 per document. I need every returned
document to be counted by the number of times it contains the searched
word in the field. *Example*: I see a numFound value of 12, but the word
what is included 20 times across all 12 documents. Could you help me find
where I'm wrong, please?


Re: trying to get Apache Solr working with Dovecot.

2015-01-22 Thread Kevin Laurie
Dear Alexandre,
Thanks for your feedback.
The solr / lucene version is 4.10.2

I am trying to figure out whether Dovecot and Solr can contact each other.
Apparently when I make searches there seems to be no contact. I might try
to rebuild dovecot again and see if that solves the problem.

I just checked /var/log/solr and it's empty. I might need to enable
debugging on Solr.

Regarding tracing, not much, as I am still relatively new (it might be a
challenge), but I will figure it out.

Is there any well-documented manual for dovecot-solr integration?

Thanks for your feedback!
Regards
Kevin A.




On Fri, Jan 23, 2015 at 1:46 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:

 Well, what does seem to happen?

 Which version of Solr is it? Can Dovecot contact Solr? If you put
 netcat listen instead of Solr on that port, it is being connected to?

 If it is, is Solr complaining about wrong url or anything in the log?
 Exceptions maybe.

 How far into the Dovecot-Solr path did you trace so far?

 Regards,
Alex.
 
 Sign up for my Solr resources newsletter at http://www.solr-start.com/


 On 23 January 2015 at 01:06, Kevin Laurie superinterstel...@gmail.com
 wrote:
  Hello,
 
  I am desperately trying to get Apache Solr to work with Dovecot FTS. I
  would really appreciate if someone could please help me!
  I have already done the following:-
 
 
  1. I can ssh into my server and see that Apache Solr is up and running.
 
   ssh -t -L 8983:localhost:8983 u...@mydomain.com
 
 
  2. In the collection1 core selector I have the following files:-
  solrconfig.xml and schema.xml
  The schema.xml output is as follows(see link):-
 
  http://pastebin.com/thGw2pQj
 
  3. I have installed the dovecot-solr already.
 
  4. Configured dovecot to run solr-fts as follows:-
 
  In 10-mail.conf:
 
  # Space separated list of plugins to load for all services. Plugins
  specific to
  # IMAP, LDA, etc. are added to this list in their own .conf files.
  #mail_plugins =
  mail_plugins = fts fts_solr
 
  In 90-plugin.conf:-
 
  plugin {
 fts = solr
 fts_solr = break-imap-search url=http://localhost:8983/solr/
  }
 
  Despite all the above the Solr does not seem to run FTS for Dovecot.
  Appreciate if some one could give some feedback
 
  Thanks
  Kevin