Re: Solrj problem

2014-05-15 Thread blach
I have added the dependency

<dependency>
  <groupId>org.apache.httpcomponents</groupId>
  <artifactId>httpclient</artifactId>
  <version>[4.3.1]</version>
</dependency>

but it is still giving me the same error.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solrj-problem-tp4135030p4135047.html
Sent from the Solr - User mailing list archive at Nabble.com.


Difference between search strings

2014-05-15 Thread nativecoder
Can someone please tell me the difference between searching a text in the
following ways

1. q=Exact_Word:"samplestring" -> What does it tell to solr  ?

2. q=samplestring&qf=Exact_Word -> What does it tell to solr  ?

3. q="samplestring"&qf=Exact_Word -> What does it tell to solr  ?
 
I think the first and the third one are the same.  is it correct ? How does
it differ from the second one.

I am trying to understand how enclosing the full term in "" is resolving the
solr specific special character problem? What does it tell to solr  ? e.g If
there is "!" mark in the string solr will identify it as a NOT, "!" is part
of the string. This issue can be corrected if the full string is enclosed in
a "". 






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Difference-between-search-strings-tp4135571.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: dynamic field assignments

2014-05-15 Thread John Thorhauer
Chris,

Thanks so much for the suggestion.  I will look into this approach.  It
looks very promising!

John


On Mon, May 5, 2014 at 9:50 PM, Chris Hostetter wrote:

>
> : My understanding is that DynamicField can do something like
> : FOO_BAR_TEXT_* but what I really need is *_TEXT_* as I might have
> : FOO_BAR_TEXT_1 but I also might have WIDGET_BAR_TEXT_2.  Both of those
> : field names need to map to a field type of 'fullText'.
>
> I'm pretty sure you can get what you are after with the new Managed Schema
> functionality...
>
> https://cwiki.apache.org/confluence/display/solr/Schemaless+Mode
>
> https://cwiki.apache.org/confluence/display/solr/Managed+Schema+Definition+in+SolrConfig
>
> Assuming you have managed schema enabled in solrconfig.xml, and you define
> both of your fieldTypes using names like "text" and "select" then
> something like this should work in your processor chain...
>
>  
>.*_TEXT_.*
>text
>  
>  
>.*_SELECT_.*
>select
>  
>
>
> (Normally that processor is used once with multiple value->type mappings
> -- but in your case you don't care about the run-time value, just the
> run-time field name regex, which should also be configurable according
> to the various FieldNameSelector rules...)
>
>
> https://lucene.apache.org/solr/4_8_0/solr-core/org/apache/solr/update/processor/AddSchemaFieldsUpdateProcessorFactory.html
>
> https://lucene.apache.org/solr/4_8_0/solr-core/org/apache/solr/update/processor/FieldMutatingUpdateProcessorFactory.html
>
>
> -Hoss
> http://www.lucidworks.com/
>



-- 
John Thorhauer
Director/Remote Interfaces
Yakabod, Inc.
301-662-4554 x2105


Re: Solrj Default Data Format

2014-05-15 Thread Furkan KAMACI
Hi;

I found the reason for the weird format mentioned in my previous mail. I have
now captured the data with Wireshark and I see that it is pure XML, and the
content type is set to application/xml.

Any ideas about why it is not javabin?

Thanks;
Furkan KAMACI


2014-05-07 22:16 GMT+03:00 Furkan KAMACI :

> Hmmm, I see that it is like XML format but not. I have added three
> documents but has something like that:
>
> 
> 
> id1
> id2
> id3
> id4
> d1
> d2
> d3
> d4
> 
> 
> 
> 
> 
>
> is this javabin format? I mean optimizing XML and having a first byte of
> "2"?
>
> Thanks;
> Furkan KAMACI
>
>
> 2014-05-07 22:04 GMT+03:00 Furkan KAMACI :
>
> Hi;
>>
>> I am testing Solrj. I use Solr 4.5.1 and HttpSolrServer for my test. I
>> just generate some SolrInputDocuments and call add method of server to add
>> them. When  I track the request I see that data is at XML format instead of
>> javabin. Do I miss anything?
>>
>> Thanks;
>> Furkan KAMACI
>>
>
>


AnalyzingInfixLookupFactory with multiple cores

2014-05-15 Thread Michael Sokolov
It seems as if the location of the suggester dictionary directory is not 
core-specific, so when the suggester is defined for multiple cores, they 
collide: you get exceptions attempting to obtain the lock, and the 
suggestions bleed from one core to the other.   There is an 
(undocumented) "indexPath" parameter that can be used to control this, 
so I think I can work around the problem using that, but it would be a 
nice feature if the suggester index directory were relative to the core 
directory rather than the current working directory of the process.


Question: is the current core directory (or even its name) available as 
a variable that gets substituted in solrconfig.xml?  I.e. ${core-name} 
or something?


-Mike


Replica as a "leader"

2014-05-15 Thread adfel70
/Solr &Collection Info:/
Solr 4.8 , 4 shards, 3 replicas per shard, 30-40 million docs per shard.

/Process:/
1. Indexing 100-200 docs per second.
2. Doing Pkill -9 java to 2 replicas (not the leader) in shard 3 (while
indexing).
3. Indexing for 10-20 minutes and doing hard commit. 
4. Doing Pkill -9 java to the leader and then starting one replica in shard
3 (while indexing).
5. After 20 minutes starting another replica in shard 3 ,while indexing (not
the leader in step 1). 
6. After 10 minutes starting the rep that was the leader in step 1. 

/Results:/
2. Only the leader is active in shard 3.
3. Thousands of docs were added to the leader in shard 3.
4. After starting the replica, its state was down, and after 10 minutes it
became the leader in the cluster state (while still down). No servers hosting
shards for index and search requests.
*5. After starting another replica, its state was recovering for 2-3
minutes and then it became active (not leader in cluster state).
   Index, commit and search requests are handled by the other replica
(active status, not leader!!!).
   The search results do not include docs that were indexed to the leader
in step 3.  *
6. syncing with the active rep. 

/Expected:/
*5. To stay in down status.
   Not to handle index, commit and search requests - no servers hosting
shards!*
6. Become the leader.

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Replica-as-a-leader-tp4135078.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: retreive all the fields in join

2014-05-15 Thread Mikhail Khludnev
On Sun, May 11, 2014 at 12:14 PM, Aman Tandon wrote:

> Is it possible?


no.


-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics


 


Re: Too many documents Exception

2014-05-15 Thread yamazaki
Thanks, Jack.

Is there a way to suppress this exception?

For example, via a setting such as
2147483647 ?


When this exception occurs, the index cannot be read.
If SolrCloud is used, some data cannot be read:

shard1: documents over 2^31-1
shard2: documents not over 2^31-1

shard1 goes down; the shard1 index is dead.

-- yamazaki


2014-05-07 11:01 GMT+09:00 Jack Krupansky :
> Lucene only supports 2^31-1 documents in an index, so Solr can only support
> 2^31-1 documents in a single shard.
>
> I think it's a bug that Lucene doesn't throw an exception when more than
> that number of documents have been inserted. Instead, you get this error
> when Solr tries to read such an overstuffed index.
>
> -- Jack Krupansky
>
> -Original Message- From: [Tech Fun]山崎
> Sent: Tuesday, May 6, 2014 8:54 PM
> To: solr-user@lucene.apache.org
> Subject: Too many documents Exception
>
>
> Hello everybody,
>
> Solr 4.3.1(and 4.7.1), Num Docs + Deleted Docs >
> 2147483647(Integer.MAX_VALUE) over
> Caused by: java.lang.IllegalArgumentException: Too many documents,
> composite IndexReaders cannot exceed 2147483647
>
> It seems to be trouble similar to the unresolved e-mail.
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201307.mbox/browser
>
> If How can I fix this?
> This Solr Specification?
>
>
> log.
>
> ERROR org.apache.solr.core.CoreContainer  – Unable to create core:
> collection1
> org.apache.solr.common.SolrException: Error opening new searcher
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:618)
>at
> org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:949)
>at org.apache.solr.core.CoreContainer.create(CoreContainer.java:984)
>at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
>at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.solr.common.SolrException: Error opening new searcher
>at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1438)
>at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1550)
>at org.apache.solr.core.SolrCore.<init>(SolrCore.java:796)
>... 13 more
> Caused by: org.apache.solr.common.SolrException: Error opening Reader
>at
> org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:172)
>at
> org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:183)
>at
> org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:179)
>at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1414)
>... 15 more
> Caused by: java.lang.IllegalArgumentException: Too many documents,
> composite IndexReaders cannot exceed 2147483647
>at
> org.apache.lucene.index.BaseCompositeReader.<init>(BaseCompositeReader.java:77)
>at
> org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:368)
>at
> org.apache.lucene.index.StandardDirectoryReader.<init>(StandardDirectoryReader.java:42)
>at
> org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:71)
>at
> org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:783)
>at
> org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:52)
>at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:88)
>at
> org.apache.solr.core.StandardIndexReaderFactory.newReader(StandardIndexReaderFactory.java:34)
>at
> org.apache.solr.search.SolrIndexSearcher.getReader(SolrIndexSearcher.java:169)
>... 18 more
> ERROR org.apache.solr.core.CoreContainer  –
> null:org.apache.solr.common.SolrException: Unable to create core:
> collection1
>at
> org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1450)
>at org.apache.solr.core.CoreContainer.create(CoreContainer.java:993)
>at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:597)
>at org.apache.solr.core.CoreContainer$2.call(CoreContainer.java:592)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:8

Re: Inconsistent response from Cloud Query

2014-05-15 Thread Vineet Mishra
Hi Shawn,

There is no recovery case for me, nor is there a pending commit.
The case I am talking about is when I restart the Cloud all over again with
index already flushed to disk.

Thanks!


On Sun, May 11, 2014 at 10:17 PM, Shawn Heisey  wrote:

> On 5/9/2014 11:42 AM, Cool Techi wrote:
> > We have noticed Solr returns in-consistent results during replica
> recovery and not all replicas are in the same state, so when your query
> goes to a replica which might be recovering or still copying the index then
> the counts may differ.
> > regards,Ayush
>
> SolrCloud should never send requests to a replica that is recovering.
> If that is happening (which I think is unlikely), then it's a bug.
>
> If *you* send a request to a replica that is still recovering, I would
> expect SolrCloud to redirect the request elsewhere unless distrib=false
> is used.  I'm not sure whether that actually happens, though.
>
> Thanks,
> Shawn
>
>


RE: Solr + SPDY

2014-05-15 Thread Markus Jelsma
Hi Harsh,

 
Does SPDY provide lower latency than HTTP/1.1 with KeepAlive or is it 
encryption that you're after?

 
Markus


 
-Original message-
From:harspras 
Sent:Tue 13-05-2014 05:38
Subject:Re: Solr + SPDY
To:solr-user@lucene.apache.org; 
Hi Vinay,

I have been trying to set up a similar environment with SPDY enabled
for Solr inter-shard communication. Did you manage to get it working?
I somehow cannot use SolrCloud with SPDY enabled in Jetty.

Regards,
Harsh Prasad



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-SPDY-tp4097771p4135377.html
Sent from the Solr - User mailing list archive at Nabble.com.


KeywordTokenizerFactory splits the string for the exclamation mark

2014-05-15 Thread Romani Rupasinghe
Hi All

I have a following field settings in solr schema









As you can see Exact_Word has the KeywordTokenizerFactory and that should
treat the string as it is.

Following is my responseHeader. As you can see I am searching my string
only in the field Exact_Word and expecting it to return the Word field and
the score

"responseHeader":{
"status":0,
"QTime":14,
"params":{
  "explainOther":"",
  "fl":"Word,score",
  "debugQuery":"on",
  "indent":"on",
  "start":"0",
  "q":"d!sdasdsdwasd!a...@dsadsadas.edu",
  "qf":"Exact_Word",
  "wt":"json",
  "fq":"",
  "version":"2.2",
  "rows":"10"}},


But when I enter an email with the following string
"d!sdasdsdwasd...@dsadsadas.edu" it splits the string in two. I was under the
impression that KeywordTokenizerFactory would treat the string as it is.

Following is the query debug result. There you can see it has split the word
 "parsedquery":"+((DisjunctionMaxQuery((Exact_Word:d))
-DisjunctionMaxQuery((Exact_Word:sdasdsdwasd...@dsadsadas.edu)))~1)",

Can someone please tell me why it produces this query result?

If I put a string without the "!" sign as below, the produced query will be
as below
 "parsedquery":"+DisjunctionMaxQuery((
Exact_Word:d_sdasdsdwasd_...@dsadsadas.edu))",. This is what I expected
solr to even with the "!" mark. with "_" mark it wont do a string split and
treats the string as it is

I thought that if KeywordTokenizerFactory is applied, it should keep
the exact string as it is.

Please help me to understand what is going wrong here


deep paging without sorting / keep IRs open

2014-05-15 Thread Tommaso Teofili
Hi all,

in one use case I'm working on [1] I am using Solr in combination with a
MVCC system [2][3], so that the (Solr) index is kept up to date with the
system and must handle search requests that are tied to a certain state /
version of it and of course multiple searches based on different versions
of the system have to run together.

So to make an example an indexing request (with commit) creates doc x and
y, a search for all the docs retrieves x and y, then a second indexing
requests (with commit) adds doc z, a search for all the docs retrieves x y
and z; that's fine as soon as the number of results is not big, but if
search requests are paged (with start and rows parameters) then the above
example doesn't work as multiple requests with underlying changing data
would have to be done to get pages.
In the above scenario, if rows = 1 then each request would retrieve 1
doc at a time, with 'numFound' changing on the second request (from 2 to
3), which would not be consistent.

Basically I need the ability to keep running searches against a specified
commit point / index reader / state of the Lucene / Solr index.
So I wonder if a similar thing like the one done for "cursorMark" can be
done in order to address that, of course such "long running IndexReaders"
would have to be disposed after some time.

WDYT?
Regards,
Tommaso

[1] : http://jackrabbit.apache.org/oak
[2] : http://en.wikipedia.org/wiki/Multiversion_concurrency_control
[3] :
http://wiki.apache.org/jackrabbit/RepositoryMicroKernel?action=AttachFile&do=view&target=MicroKernel+Revision+Model.pdf
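
(For reference, a rough SolrJ sketch of the cursorMark paging mentioned above,
assuming Solr 4.7+ and an existing SolrServer named "server"; it pages stably
by sort order, but it still does not pin requests to a single commit point,
which is the gap described in this mail:)

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.params.CursorMarkParams;

  SolrQuery q = new SolrQuery("*:*");
  q.setRows(1);
  q.addSort("id", SolrQuery.ORDER.asc);               // cursorMark requires a sort ending on the uniqueKey
  String cursor = CursorMarkParams.CURSOR_MARK_START;  // "*"
  while (true) {                                        // (exception handling omitted)
      q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
      QueryResponse rsp = server.query(q);
      // ... consume rsp.getResults() ...
      String next = rsp.getNextCursorMark();
      if (cursor.equals(next)) {
          break;                                        // cursor did not advance: no more results
      }
      cursor = next;
  }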


Re: Help to Understand a Solr Query

2014-05-15 Thread nativecoder
Hi All

I have a following field settings in solr schema

Exact_Word" omitPositions="true" termVectors="false"
omitTermFreqAndPositions="true" compressed="true" type="string_ci"
multiValued="false" indexed="true" stored="true" required="false"
omitNorms="true"/>







As you can see Exact_Word has the KeywordTokenizerFactory and that should
treat the string as it is.

Following is my responseHeader. As you can see I am searching my string only
in the field Exact_Word and expecting it to return the Word field and the
score

"responseHeader":{
"status":0,
"QTime":14,
"params":{
  "explainOther":"",
  "fl":"Word,score",
  "debugQuery":"on",
  "indent":"on",
  "start":"0",
  "q":"d!sdasdsdwasd!a...@dsadsadas.edu",
  "qf":"Exact_Word",
  "wt":"json",
  "fq":"",
  "version":"2.2",
  "rows":"10"}},


But when I enter an email with the following string
"d!sdasdsdwasd...@dsadsadas.edu" it splits the string in two. I was under
the impression that KeywordTokenizerFactory would treat the string as it is.

Following is the query debug result. There you can see it has split the word
 "parsedquery":"+((DisjunctionMaxQuery((Exact_Word:d))
-DisjunctionMaxQuery((Exact_Word:sdasdsdwasd...@dsadsadas.edu)))~1)",

Can someone please tell me why it produces this query result?

If I put a string without the "!" sign as below, the produced query will be
as below

"parsedquery":"+DisjunctionMaxQuery((Exact_Word:d_sdasdsdwasd_...@dsadsadas.edu))",.
This is what I expected solr to even with the "!" mark. with "_" mark it
wont do a string split and treats the string as it is

I thought that if KeywordTokenizerFactory is applied, it should keep
the exact string as it is.

Please help me to understand what is going wrong here 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-to-Understand-a-Solr-Query-tp4134686p4135464.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Replica active during warming

2014-05-15 Thread lboutros
In other words, is there a way for the LBHttpSolrServer to ignore replicas
which are currently "cold" ?

Ludovic.



-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Replica-active-during-warming-tp4135274p4135542.html
Sent from the Solr - User mailing list archive at Nabble.com.


Reiterating again in the solr returned result set

2014-05-15 Thread rio
Hi All

I am given a requirement to check whether the user entered word is already
entered by some other user. It has been suggested to do a solr search for
this. Following is the schema for the word field type 

Exact_Word" omitPositions="true" termVectors="false"
omitTermFreqAndPositions="true" compressed="true" type="string_ci"
multiValued="false" indexed="true" stored="true" required="false"
omitNorms="true"/>







As this should be a "exact" search not a "contains" search above solr schema
is presented and the text search will happen on the Exact_Word field. 

my search string would be somthing like below

q="SampleString"&qf=Exact_Word&defType=edisMax&fl=Word

Once the search happens it will return list of documents which has the
entered string in the WORD field. Right now I am taking a count of this and
if it is more than 0 I assume that the string is presented. Can I depend on
this search result or do I need to re iterate in the solr returned result
set and again search for the result.

Please advice 
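
(A minimal SolrJ sketch of the count-based check described above, assuming the
edismax setup from this post and an existing SolrServer named "server"; the
field and value are only illustrative:)

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.response.QueryResponse;

  SolrQuery q = new SolrQuery("\"SampleString\"");  // quoted so special characters are not parsed as operators
  q.set("defType", "edismax");
  q.set("qf", "Exact_Word");
  q.setRows(0);                                     // only numFound is needed for the existence check
  QueryResponse rsp = server.query(q);
  boolean alreadyExists = rsp.getResults().getNumFound() > 0;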












--
View this message in context: 
http://lucene.472066.n3.nabble.com/Reiterating-again-in-the-solr-returned-result-set-tp4135579.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Error when creating collection

2014-05-15 Thread Shawn Heisey
On 5/13/2014 4:39 PM, Mark Olsen wrote:
> I'm creating a collection via Java using this function call: 
>
> String collection = "profile-2"; 
> CoreAdminRequest.Create createRequest = new CoreAdminRequest.Create(); 
> createRequest.setCoreName(collection); 
> createRequest.setCollection(collection); 
> createRequest.setInstanceDir(collection); 
> createRequest.setNumShards(1); 
> createRequest.process(server); 
>
> It is timing out with this exception (from the solr.out logs): 
>
> SEVERE: org.apache.solr.common.SolrException: Error CREATEing SolrCore 
> 'profile-2': Could not get shard_id for core: profile-2 
> coreNodeName:192.168.1.152:8983_solr_profile-2 
> at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleCreateAction(CoreAdminHandler.java:483)
>  
> at 
> org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:140)
>  
> at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>  
> at 
> org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:591)
>  
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:192)
>  
> at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
>  
> ... 
> Caused by: org.apache.solr.common.SolrException: Could not get shard_id for 
> core: profile-2 coreNodeName:192.168.1.152:8983_solr_profile-2 
> at 
> org.apache.solr.cloud.ZkController.doGetShardIdProcess(ZkController.java:1221)
>  
> at org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1290) 
> at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:861) 
>
> In a "development" environment the zookeeper/solr instances are running with 
> elevated permissions and this function worked without error. 
> In a "test" environment (which matches the "production" environment) the 
> permissions are more restricted. I made sure the group/owner of the 
> /usr/local/solr directory are set up to be the correct user. 

This is happening because you never set the shard ID.  See the "Caused
by" message above.  There is a setShardID method on the class that you
are using.  I believe this would typically get something like "shard1"
as a value.

The user that runs Solr must typically have write permissions to the
solr home and all of its descendants.

Note that with the CoreAdminRequest class, you are not creating a
collection.  You are creating a core.  If you want to create an entire
collection (which will typically create at least two cores on different
Solr instances), you need to use CollectionAdminRequest instead.
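
(For reference, a hedged sketch of the collection-level call with Solr 4.x
SolrJ, assuming a CloudSolrServer named "server" and a config set named
"myconf" already uploaded to ZooKeeper; names are only illustrative:)

  import org.apache.solr.client.solrj.request.CollectionAdminRequest;

  CollectionAdminRequest.Create create = new CollectionAdminRequest.Create();
  create.setCollectionName("profile-2");
  create.setConfigName("myconf");        // assumption: config set already in ZooKeeper
  create.setNumShards(1);
  create.setReplicationFactor(2);
  create.process(server);                // server should be a CloudSolrServer pointed at ZooKeeper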

https://lucene.apache.org/solr/4_8_0/solr-solrj/org/apache/solr/client/solrj/request/CollectionAdminRequest.Create.html

http://wiki.apache.org/solr/SolrTerminology

Thanks,
Shawn



Re: Easises way to insatll solr cloud with tomcat

2014-05-15 Thread Greg Walters
While solr can run under tomcat, the (strongly) recommended container is the 
jetty that comes with solr. In my experience it's possible to just deploy the 
solr.war to tomcat like any other J2EE app but it runs better under the 
included jetty.

Thanks,
Greg

On May 14, 2014, at 9:39 AM, Matt Kuiper  wrote:

> Check out http://heliosearch.com/download.html  
> 
> This is a distribution of Apache Solr packaged with Tomcat.
> 
> I have found it simple to use.
> 
> Matt
> 
> -Original Message-
> From: Aman Tandon [mailto:amantandon...@gmail.com] 
> Sent: Monday, May 12, 2014 6:24 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Easises way to insatll solr cloud with tomcat
> 
> Can anybody help me out??
> 
> With Regards
> Aman Tandon
> 
> 
> On Mon, May 12, 2014 at 1:24 PM, Aman Tandon wrote:
> 
>> Hi,
>> 
>> I tried to set up solr cloud with jetty which works fine. But in our 
>> production environment we uses tomcat so i need to set up the solr 
>> cloud with the tomcat. So please help me out to how to setup solr 
>> cloud with tomcat on single machine.
>> 
>> Thanks in advance.
>> 
>> With Regards
>> Aman Tandon
>> 



Re: Help to Understand a Solr Query

2014-05-15 Thread Jack Krupansky
Please don't re-use an existing message thread for a new, completely 
independent question!


Also, try to make the subject line indicate something about the actual 
issue.


-- Jack Krupansky

-Original Message- 
From: nativecoder

Sent: Tuesday, May 13, 2014 10:56 AM
To: solr-user@lucene.apache.org
Subject: Re: Help to Understand a Solr Query

Hi All

I have a following field settings in solr schema

Exact_Word" omitPositions="true" termVectors="false"
omitTermFreqAndPositions="true" compressed="true" type="string_ci"
multiValued="false" indexed="true" stored="true" required="false"
omitNorms="true"/>







As you can see Exact_Word has the KeywordTokenizerFactory and that should
treat the string as it is.

Following is my responseHeader. As you can see I am searching my string only
in the filed Exact_Word and expecting it to return the Word field and the
score

"responseHeader":{
   "status":0,
   "QTime":14,
   "params":{
 "explainOther":"",
 "fl":"Word,score",
 "debugQuery":"on",
 "indent":"on",
 "start":"0",
 "q":"d!sdasdsdwasd!a...@dsadsadas.edu",
 "qf":"Exact_Word",
 "wt":"json",
 "fq":"",
 "version":"2.2",
 "rows":"10"}},


But when I enter email with the following string
"d!sdasdsdwasd...@dsadsadas.edu" it splits the string to two. I was under
the impression that KeywordTokenizerFactory will treat the string as it is.

Following is the query debug result. There you can see it has split the word
"parsedquery":"+((DisjunctionMaxQuery((Exact_Word:d))
-DisjunctionMaxQuery((Exact_Word:sdasdsdwasd...@dsadsadas.edu)))~1)",

can someone please tell why it produce the query result as this

If I put a string without the "!" sign as below, the produced query will be
as below

"parsedquery":"+DisjunctionMaxQuery((Exact_Word:d_sdasdsdwasd_...@dsadsadas.edu))",.
This is what I expected solr to even with the "!" mark. with "_" mark it
wont do a string split and treats the string as it is

I thought if the KeywordTokenizerFactory is applied then it should return
the exact string as it is

Please help me to understand what is going wrong here



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-to-Understand-a-Solr-Query-tp4134686p4135464.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Indexing DateField timezone problem

2014-05-15 Thread Ahmet Arslan
Hi Hakan,

You could set -Duser.timezone="UTC" when starting solr.

Ahmet
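
(For illustration, a small Java sketch of where the 3-hour shift comes from:
the DB value is a local wall-clock time, while Solr always stores and prints
dates in UTC. The Europe/Istanbul zone is an assumption for Turkey, which was
on UTC+3 in May 2014:)

  import java.text.SimpleDateFormat;
  import java.util.Date;
  import java.util.TimeZone;

  SimpleDateFormat local = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
  local.setTimeZone(TimeZone.getTimeZone("Europe/Istanbul"));  // zone the DB value is interpreted in
  Date d = local.parse("2014-05-01 23:59:00");                 // throws ParseException, handling omitted

  SimpleDateFormat utc = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
  utc.setTimeZone(TimeZone.getTimeZone("UTC"));
  System.out.println(utc.format(d));                           // prints 2014-05-01T20:59:00Z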



On Wednesday, May 14, 2014 2:46 PM, hakanbillur  
wrote:
 
 

Hi,

I have a problem about indexing UTC date format to solr from DB. For
example, in DB, date:"2014-05-01 23:59:00" and same date: "date":
"2014-05-01T20:59:00Z" in solr. 
There are time diifference -3 hours! (For Turkey).

you can see about two captures on the right side.

i hope, someone can help me.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-DateField-timezone-problem-tp4135079.html
Sent from the Solr - User mailing list archive at Nabble.com.



is it possible for solr to calculate and give back the price of a product based on its sub-products

2014-05-15 Thread Mohamed23
Hi,

I am using Solr for searching magento products in my project,
I want to know, is it possible for solr to calculate and give back the price
of a product based on its sub-products(items);

For instance, I have a product P1 and it is the parent of items m1, m2.
I need to get the minimal price of the items and return it as the price of
product P1.

I'm wondering if that is possible; can you help me?
I need to know if Solr can do that, or if there is a feature or a way to do
it.
And finally i thank you!

regards,
Mohamed.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/is-it-possible-for-solr-to-calculate-and-give-back-the-price-of-a-product-based-on-its-sub-products-tp4135081.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solrj Default Data Format

2014-05-15 Thread Furkan KAMACI
Hi;

I am testing Solrj. I use Solr 4.5.1 and HttpSolrServer for my test. I just
generate some SolrInputDocuments and call the add method of the server to add
them. When I track the request I see that the data is in XML format instead
of javabin. Am I missing anything?

Thanks;
Furkan KAMACI
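
(For reference: by default SolrJ's HttpSolrServer sends update requests as XML
and only parses responses as javabin. A minimal sketch of switching the
request side to javabin, assuming a Solr 4.x SolrJ client and a hypothetical
core URL, would be:)

  import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
  import org.apache.solr.client.solrj.impl.HttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
  // Send the add/commit requests in the javabin format instead of the default XML.
  server.setRequestWriter(new BinaryRequestWriter());

  SolrInputDocument doc = new SolrInputDocument();
  doc.addField("id", "id1");
  server.add(doc);
  server.commit();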


Re: Physical Files v. Reported Index Size

2014-05-15 Thread Otis Gospodnetic
Darrell,

Look at the top index.x directory in your second image.  Looks like
that's your index, the same one you see in the Solr UI.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



On Tue, May 6, 2014 at 11:34 PM, Darrell Burgan wrote:

>  Hello all, I’m trying to reconcile what I’m seeing in the file system
> for a Solr index versus what it is reporting in the UI. Here’s what I see
> in the UI for the index:
>
>
>
> https://s3-us-west-2.amazonaws.com/pa-darrell/ui.png
>
>
>
> As shown, the index is 74.85 GB in size. However, here is what I see in
> the data folder of the file system on that server:
>
>
>
> https://s3-us-west-2.amazonaws.com/pa-darrell/file-system.png
>
>
>
> As shown, it is consuming 109 GB of space. Also note that one of the index
> folders is 75 GB in size.
>
>
>
> My question is why the difference, and whether I can remove some of these
> index folders to reclaim file system space? Or is there a Solr command to
> do it (is it as obvious as “Optimize”)?
>
>
>
> If there is a manual I should RTFM about the file structure, please point me
> to it.  :)
>
>
>
> Thanks!
>
> Darrell
>
>
>
>
>
> [image: Description: Infor] 
>
> *Darrell Burgan* | Architect, Sr. Principal, PeopleAnswers
>
> office: 214 445 2172 | mobile: 214 564 4450 | fax: 972 692 5386 |
> darrell.bur...@infor.com | http://www.infor.com
>
> CONFIDENTIALITY NOTE: This email (including any attachments) is
> confidential and may be protected by legal privilege. If you are not the
> intended recipient, be aware that any disclosure, copying, distribution, or
> use of the information contained herein is prohibited.  If you have
> received this message in error, please notify the sender by replying to
> this message and then delete this message in its entirety. Thank you for
> your cooperation.
>
>
>


Re: Website running Solr

2014-05-15 Thread Michael Sokolov

On 5/11/2014 12:55 PM, Olivier Austina wrote:

Hi All,
Is there a way to know if a website use Solr? Thanks.
Regards
Olivier


Ask the people who run the site?


New equivalent to QueryParsing.parseQuery()?

2014-05-15 Thread Jeff Leedy
I have some older code that works as expected in Solr 3.4:

final IndexSchema indexSchema = new IndexSchema(
  new SolrConfig(solrHome +
"/repository","solrconfig.xml",null), "schema.xml", null);
final Query luceneQuery = QueryParsing.parseQuery(
  query, "text", indexSchema);
luceneIndex.getIndexSearcher().search(luceneQuery, collector);

This appears to suck in all of the good stuff from the solrconfig.xml and
schema.xml, which is great. However, for Solr 4.0, I'm trying to find an
equivalent to QueryParsing.parseQuery() (which no longer exists) that lets
me incorporate these config files as before. I'm (naively?) trying the
following:

final StandardQueryParser parser = new StandardQueryParser();
final Query luceneQuery = parser.parse(query, "text");
luceneIndex.getIndexSearcher().search(luceneQuery, collector);

However, the behavior of the StandardQueryParser seems to be different
enough to make some previously good queries fail, and I've not found a new
way to incorporate the xml config files. It seems silly to manually
reconstitute the relevant analyzers, filters, etc. from the schema in this
query code in my application. Is there a 4.0 equivalent to the older code
that works similarly, or are things more complicated?

Thanks in advance...

Jeff




--
View this message in context: 
http://lucene.472066.n3.nabble.com/New-equivalent-to-QueryParsing-parseQuery-tp4135050.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solrj Default Data Format

2014-05-15 Thread Furkan KAMACI
Hi;

I have resolved my problem. I think that there is another problem with
Solrj. I will send it another thread.

Thanks;
Furkan KAMACI


2014-05-08 17:17 GMT+03:00 Furkan KAMACI :

> Hi;
>
> I found the reason of weird format at my previous mail. Now I capture the
> data with wireshark and I see that it is pure XML and content type is set
> to application/xml?
>
> Any ideas about why it is not javabin?
>
> Thanks;
> Furkan KAMACI
>
>
> 2014-05-07 22:16 GMT+03:00 Furkan KAMACI :
>
> Hmmm, I see that it is like XML format but not. I have added three
>> documents but has something like that:
>>
>> 
>> 
>> id1
>> id2
>> id3
>> id4
>> d1
>> d2
>> d3
>> d4
>> 
>> 
>> 
>> 
>> 
>>
>> is this javabin format? I mean optimizing XML and having a first byte of
>> "2"?
>>
>> Thanks;
>> Furkan KAMACI
>>
>>
>> 2014-05-07 22:04 GMT+03:00 Furkan KAMACI :
>>
>> Hi;
>>>
>>> I am testing Solrj. I use Solr 4.5.1 and HttpSolrServer for my test. I
>>> just generate some SolrInputDocuments and call add method of server to add
>>> them. When  I track the request I see that data is at XML format instead of
>>> javabin. Do I miss anything?
>>>
>>> Thanks;
>>> Furkan KAMACI
>>>
>>
>>
>


Highlighting: single id:... x1000 vs (id: OR id: ... x1000)

2014-05-15 Thread Jochen Barth
Dear reader,

I'd like to highlight very large OCR docs (termVectors=true etc.).
Therefore I've made a separate "highlight store" collection where I
want to highlight ids selected by another query against a separate
collection (containing the same ids).

Now querying like this:


q=ocr:abc AND id:x1 hl=true hl.fl=ocr hl.useFastVector... ...
q=ocr:abc AND id:x2 hl=true hl.fl=ocr hl.useFastVector..
q=ocr:abc AND id:x3 hl=true hl.fl=ocr hl.useFastVector..
q=ocr:abc AND id:x4 hl=true hl.fl=ocr hl.useFastVector..

... up to x1000, works much faster than

q=ocr:abc AND (id:x1 OR id:x2 OR id:x3 OR id... ... id:x1000)

Why?

Kind regards,
Jochen barth


-- 
J. Barth * IT, Universitaetsbibliothek Heidelberg * 06221 / 54-2580

pgp public key:
http://digi.ub.uni-heidelberg.de/barth%40ub.uni-heidelberg.de.asc


callback with active state of solr core (solr ver 4.3.1)

2014-05-15 Thread ronak kirit
Hello,

We are using Solr 4.3.1 and running it in SolrCloud mode. We would
like to keep some dynamic configs under the data directory of every shard
and/or replica of a collection. If a node is not in the active state (let's
say it is in recovery or another state) and it later comes back to the
active state, is there any callback that can be registered for the active
state? If the component is SolrCoreAware, would that component's inform
method be called every time the node comes back to the active state?

Thanks,
Ronak


Re: search multiple cores

2014-05-15 Thread Alvaro Cabrerizo
As far as I know (and how i have been using it), the join can't do what you
want. The structure of the query you could try (among others) is :

1. http://SOLR_ADDRESS/coreA/select?q=A&fq={!join ... fromCore=coreB}B
2. http://SOLR_ADDRESS/coreA/select?q=A AND _query_:"{!join ... fromCore=coreB}B"

Where:

   - A is a constraint over the documents of coreA (the documents returned
   by the query belong to this core).
   - B is a constraint over the documents in the coreB
   - fq is a constraint that have to satisfy documents in core A that
   depends on documents of B (query 1.)
   - The nested query in 2. is similar to the fq in query 1.

If I've understood your requirement, you would like to get documents from
coreA that satisfy a condition depending on documents of coreB, and those
documents of coreB should also satisfy a condition from documents of coreC.
This kind of transitivity (A<-B<-C) is the one I think can't be addressed
by the join parser. In the structure of the former presented queries I
can't guess how to include the constraint between coreB and coreC.

In case you have three cores in action, the query you could execute (not
tested but I can't see any issue) would look like this:

3. http://SOLR_ADDRESS/coreA/select?q=A&fq={!join ... from=coreB}B&fq={!join ... fromCore=coreC}C
4. http://SOLR_ADDRESS/coreA/select?q=A AND _query_:"{!join ... fromCore=coreB}B" AND _query_:"{!join ... fromCore=coreC}C"

But in this case there is no a "transitive" restriction but independent
conditions between coreA - coreB and coreA - coreC.


Regards.


On Wed, May 14, 2014 at 5:27 AM, Jay Potharaju wrote:

> Hi,
> I am trying to join across multiple cores using query time join. Following
> is my setup
> 3 cores - Solr 4.7
> core1:  0.5 million documents
> core2: 4 million documents and growing. This contains the child documents
> for documents in core1.
> core3: 2 million documents and growing. Contains records from all users.
>
>  core2 contains documents that are accessible to each user based on their
> permissions. The number of documents accessible to a user range from couple
> of 1000s to 100,000.
>
> I would like to get results by combining all three cores. For each search I
> get documents from core3 and then query core1 to get parent documents &
> then core2 to get the appropriate child documents depending of user
> permissions.
>
> I 'm referring to this link to join across cores
>
> http://stackoverflow.com/questions/12665797/is-solr-4-0-capable-of-using-join-for-multiple-core
>
> {!join from=fromField to=toField fromIndex=fromCoreName}fromQuery
>
> This is not working for me. Can anyone suggest why it is not working. Any
> pointers on how to search across multiple cores.
>
> thanks
>
>
>
> J
>


Indexing DateField timezone problem

2014-05-15 Thread hakanbillur
 
 

Hi,

I have a problem with indexing dates to Solr from the DB in UTC format. For
example, in the DB the date is "2014-05-01 23:59:00", and the same date is
"date": "2014-05-01T20:59:00Z" in Solr.
There is a time difference of -3 hours! (For Turkey.)

You can see the two captures on the right side.

I hope someone can help me.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-DateField-timezone-problem-tp4135079.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: KeywordTokenizerFactory splits the string for the exclamation mark

2014-05-15 Thread Ahmet Arslan
Hi,

It is not KeywordTokenizer; the ! character has a special meaning to the
edismax and lucene query parsers: it is the NOT operator.
If you want to search strings that could contain !, then use another query
parser, dismax for example.
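
(For reference, a small SolrJ sketch of escaping such characters before
handing the string to the lucene/edismax parser; the field name and value
below are only illustrative:)

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.util.ClientUtils;

  String raw = "d!user@example.edu";                  // hypothetical value containing '!'
  String escaped = ClientUtils.escapeQueryChars(raw); // backslash-escapes !, :, (, ) etc. so they are literal
  SolrQuery q = new SolrQuery("Exact_Word:" + escaped);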





On Wednesday, May 14, 2014 12:02 AM, nativecoder  wrote:
Hi All

I have a following field settings in solr schema

Exact_Word*" omitPositions="true"
termVectors="false" omitTermFreqAndPositions="true" compressed="true"
type="string_ci" multiValued="false" indexed="true" stored="true"
required="false" omitNorms="true"/>







As you can see Exact_Email has the KeywordTokenizerFactory and that should
treat the string as it is.

But when I enter email with the following string
"d!sdasdsdwasd...@dsadsadas.edu" it splits the string to two. I was under
the impression that KeywordTokenizerFactory will treat the string as it is.
Following is the query debug result. There you can see it has split the word 
"parsedquery":"+((DisjunctionMaxQuery((Exact_Email:d))
-DisjunctionMaxQuery((Exact_Email:sdasdsdwasd...@dsadsadas.edu)))~1)",

can someone please tell why it produce the query result as this 

If I put a string without the "!" sign as below, the produced query will be
as below

"parsedquery":"+DisjunctionMaxQuery((Exact_Email:testresu...@testdomain.com))",

I thought if the KeywordTokenizerFactory is applied then it should return
the exact string as it is

Please help me to understand what is going wrong here




--
View this message in context: 
http://lucene.472066.n3.nabble.com/KeywordTokenizerFactory-splits-the-string-for-the-exclamation-mark-tp4135460.html
Sent from the Solr - User mailing list archive at Nabble.com.



Is it possible for solr to calculate and give back the price of a product based on its sub-products

2014-05-15 Thread gharbi mohamed
Hi,

I am using Solr for searching magento products in my project,
I want to know, is it possible for solr to calculate and give back the
price of a product based on its sub-products(items);

For instance, I have a product P1 and it is the parent of items m1, m2.
I need to get the minimal price of the items and return it as the price of
product P1.

I'm wondering if that is possible.
I need to know if Solr can do that, or if there is a feature or a way to do
it.
And finally i thank you!

regards,
Mohamed.


Re: query(subquery, default) filters results

2014-05-15 Thread Matteo Grolla
Thanks very much,
I realized too late that I had skipped an important part of the wiki
documentation: "this example assumes defType=func"

thanks a lot

Il giorno 06/mag/2014, alle ore 21:05, Yonik Seeley ha scritto:

> On Tue, May 6, 2014 at 5:08 AM, Matteo Grolla  wrote:
>> Hi everybody,
>>I'm having troubles with the function query
>> 
>> "query(subquery, default)"  
>> http://wiki.apache.org/solr/FunctionQuery#query
>> 
>> running this
>> 
>> http://localhost:8983/solr/select?q=query($qq,1)&qq={!dismax qf=text}hard 
>> drive
> 
> The default query syntax is lucene, so "query(..." will just be parsed as 
> text.
> Try q={!func}query($qq,1)
> OR
> defType=func&q=query($qq,1)
> 
> -Yonik
> http://heliosearch.org - facet functions, subfacets, off-heap filters
> + fieldcache



Re: SolrMeter is dead?

2014-05-15 Thread Sameer Maggon
Have you looked at JMeter - http://jmeter.apache.org/

Thanks,
Sameer.
--
http://measuredsearch.com


On Wed, May 7, 2014 at 7:51 AM, Al Krinker  wrote:

> I am trying to test performance of my cluster (solr 4.8).
>
> SolrMeter looked promising... small and standalone. Plus, open source so
> that I could make tweaks if needed.
>
> However, I see that the last update date was in Oct 2012. Is it dead? Any
> better non commercial and preferably open sourced projects out there?
>
> Thanks,
> Al
>


Storing tweets For WC2014

2014-05-15 Thread Cool Techi
Hi,
We have a requirement from one of our customers to provide search and analytics
on the upcoming Soccer World Cup. Given the sheer volume of tweets that would
be generated at such an event, I cannot imagine what would be required to store
this in Solr.
It would be great if there could be some pointers on the scale or hardware
required, the number of shards that should be created, etc. Some requirements:
All the tweets should be searchable (approximately 100 million tweets/day * 60
days of the event). All fields on tweets should be searchable, with facets on
numeric and date fields. Facets would be run on Twitter IDs (unique users),
tweet creation date, Location, and Sentiment (some fields which we generate).

If anyone has attempted anything like this, any pointers would be helpful.
Regards, Rohit
  

Re: range types in SOLR

2014-05-15 Thread Ere Maijala

David,

thanks, looking forward to LUCENE-5648. I added a comment about 
supporting BC dates. We currently use the spatial support to index date 
ranges with a precision of one day, ranging from year - to .


Just for the record, I had some issues converting bounding box 
Intersects queries to polygons with Solr 4.6.1. Polygon version found 
way more results than it should have. I upgraded to 4.8.0 (and to JTS 
1.13 from 1.12), and now the results are correct.


--Ere

6.5.2014 21.26, david.w.smi...@gmail.com kirjoitti:

Hi Era,

I appreciate the scattered documentation is confusing for users.  The use
of spatial for time durations is definitely not an official way to do it;
it’s clearly a hack/trick — one that works pretty well if you know the
issues to watch out for.  So I don’t see it getting documented on the
reference guide.  But, you should be happy to know about this:
https://issues.apache.org/jira/browse/LUCENE-5648  “Watch” that issue to
stay abreast of my development on it, and the inevitable Solr FieldType to
follow, and inevitable documentation in the reference guide.  With luck
it’ll get in by 4.9.

The “Intersects(POLYGON(…))” syntax is something I suggest using when you
have to — like when you have a polygon or linestring or if you are indexing
circles.  One of these days there will be a more Solr friendly query parser
— definitely for 4.something.  When that happens, it’ll get
deprecated/removed in trunk/5.

~ David

On Tue, May 6, 2014 at 4:22 AM, Ere Maijala  wrote:


David,

I made a note about your mentioning the deprecation below to take it into
account in our software, but now that I tried to find out more about this I
ran into some confusion since the Solr documentation regarding spatial
searches is currently quite badly scattered and partly obsolete [1]. I'd
appreciate some clarification on what exactly is deprecated. We're
currently using spatial for both time duration and geographic searches, and
in the latter we also use e.g. Intersects(POLYGON(...)) in addition. Is
this also deprecated and if so, how should I rewrite it? Thanks!

--Ere

[1] It would be really nice if it was possible to find up to date
documentation of at least all this in one place:

https://cwiki.apache.org/confluence/display/solr/Spatial+Search
https://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4
http://wiki.apache.org/solr/SpatialForTimeDurations
https://people.apache.org/~hossman/spatial-for-non-
spatial-meetup-20130117/
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/
201212.mbox/%3c1355027722156-4025434.p...@n3.nabble.com%3E

3.3.2014 20.12, Smiley, David W. kirjoitti:


The main reference for this approach is here:
http://wiki.apache.org/solr/SpatialForTimeDurations


Hoss’s illustrations he developed for the meetup presentation are great.
However, there are bugs in the instruction — specifically it’s important
to slightly buffer the query and choose an appropriate maxDistErr.  Also,
it’s more preferable to use the rectangle range query style of spatial
query (e.g. field:[“minX minY” TO “maxX maxY”] as opposed to using
“Intersects(minX minY maxX maxY)”.  There’s no technical difference but
the latter is deprecated and will eventually be removed from Solr 5 /
trunk.

All this said, recognize this is a bit of a hack (one that works well).
There is a good chance a more ideal implementation approach is going to be
developed this year.

~ David


On 3/1/14, 2:54 PM, "Shawn Heisey"  wrote:

  On 3/1/2014 11:41 AM, Thomas Scheffler wrote:



Am 01.03.14 18:24, schrieb Erick Erickson:


I'm not clear what you're really after here.

Solr certainly supports ranges, things like time:[* TO date_spec] or
date_field:[date_spec TO date_spec] etc.


There's also a really creative use of spatial (of all things) to, say
answer questions involving multiple dates per record. Imagine, for
instance, employees with different hours on different days. You can
use spatial to answer questions like "which employees are available
on Wednesday between 4PM and 8PM".

And if none of this is relevant, how about you give us some
use-cases? This could well be an XY problem.



Hi,

lets try this example to show the problem. You have some old text that
was written in two periods of time:

1.) 2nd half of 13th century: -> 1250-1299
2.) Beginning of 18th century: -> 1700-1715

You are searching for text that were written between 1300-1699, than
this document described above should not be hit.

If you make start date and end date multiple this results in:

start: [1250, 1700]
end: [1299, 1715]

A search for documents written between 1300-1699 would be:

(+start:[1300 TO 1699] +end:[1300 TO 1699]) (+start:[* TO 1300] +end:[1300
TO *]) (+start:[* TO 1699] +end:[1700 TO *])

You see that the document above would obviously hit by "(+start:[* TO
1300] +end:[1300 TO *])"



This sounds exactly like the spatial use case that Erick just described.

http://wiki.apache.org/solr/SpatialForTimeDurations
https://people.apache.org/~hossma

Re: Difference between search strings

2014-05-15 Thread Jack Krupansky

Inside of quotes you only have to escape quote and backslash.

Add the debugQuery=true parameter to see exactly how Solr processes 
characters and generates queries.


But... in a URL you have to URL-encode URL query parameters:
http://en.wikipedia.org/wiki/Query_string
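
(A small Java illustration of that point, with a hypothetical host and field
name; inside the quoted phrase only the quote and backslash need escaping, but
the whole q value still has to be URL-encoded before it goes on the query
string:)

  String q = "Exact_Word:\"d!user@example.edu\"";   // '!' is literal inside the quoted phrase
  String url = "http://localhost:8983/solr/collection1/select"
      + "?q=" + java.net.URLEncoder.encode(q, "UTF-8")
      + "&wt=json&debugQuery=true";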

-- Jack Krupansky

-Original Message- 
From: nativecoder

Sent: Wednesday, May 14, 2014 9:15 AM
To: solr-user@lucene.apache.org
Subject: Difference between search strings

Can someone please tell me the difference between searching a text in the
following ways

1. q=Exact_Word:"samplestring" -> What does it tell to solr  ?

2. q=samplestring&qf=Exact_Word -> What does it tell to solr  ?

3. q="samplestring"&qf=Exact_Word -> What does it tell to solr  ?

I think the first and the third one are the same.  is it correct ? How does
it differ from the second one.

I am trying to understand how enclosing the full term in "" is resolving the
solr specific special character problem? What does it tell to solr  ? e.g If
there is "!" mark in the string solr will identify it as a NOT, "!" is part
of the string. This issue can be corrected if the full string is enclosed in
a "".






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Difference-between-search-strings-tp4135571.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Solr, How to index scripts *.sh and *.SQL

2014-05-15 Thread Visser, Marc
HI All,
Recently I have set up an image with SOLR. My goal is to index and extract 
files on a Windows and Linux server. It is possible for me to index and extract 
data from multiple file types. This is done by the SOLR CELL request handler. 
See the post.jar cmd below.

java -Dauto -Drecursive -jar post.jar Y:\
SimplePostTool version 1.5
Posting files to base url localhost:8983/solr/update..
Entering auto mode. File endings considered are
xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
Entering recursive mode, max depth=999, delay=0s
0 files indexed.

Is it possible to index and extract metadata/content from file types like .sh 
and .sql? If it is possible I would like to know how of course :)



Greetings

Marc


Disclaimer

This e-mail and any attachments are confidential and are solely intended for 
the addressee. If you are not the intended recipient, please notify the sender 
and delete and/or destroy this message and any attachments immediately. It is 
prohibited to copy, to distribute, to disclose or to use this e-mail and any 
attachments in any other way. Ordina N.V. and/or its group companies do not 
accept any responsibility nor liability for any damage resulting from the 
content of and/or the transmission of this message.


Re: Solrj problem

2014-05-15 Thread blach
Hello Shawn,
According to this: https://issues.apache.org/jira/browse/SOLR-5590

I understand that SolrJ still depends on the old HttpClient shipped with the
Android tools, and this is my problem too. Karl has made a patch; could you
please explain what that patch is for,

and how I can use it to solve my problem?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solrj-problem-tp4135030p4135048.html
Sent from the Solr - User mailing list archive at Nabble.com.


Sorting by custom function query

2014-05-15 Thread Emanuele Filannino
Hi there,

I'm running into some issues developing a custom function query using Solr 
3.6.2.
My goal is to be able to implement a custom sorting technique.

I have a field called daily_prices_str, it is a single value str.

Example:


2014-05-01:130 2014-05-02:130 2014-05-03:130 2014-05-04:130 2014-05-05:130 
2014-05-06:130 2014-05-07:130 2014-05-08:130 2014-05-09:130 2014-05-10:130 
2014-05-11:130 2014-05-12:130 2014-05-13:130 2014-05-14:130 2014-05-15:130 
2014-05-16:130 2014-05-17:130 2014-05-18:130 2014-05-19:130 2014-05-20:130 
2014-05-21:130 2014-05-22:130 2014-05-23:130 2014-05-24:130 2014-05-25:130 
2014-05-26:130 2014-05-27:130 2014-05-28:130 2014-05-29:130 2014-05-30:130 
2014-05-31:130 2014-06-01:130 2014-06-02:130 2014-06-03:130 2014-06-04:130 
2014-06-05:130 2014-06-06:130 2014-06-07:130 2014-06-08:130 2014-06-09:130 
2014-06-10:130 2014-06-11:130 2014-06-12:130 2014-06-13:130 2014-06-14:130 
2014-06-15:130 2014-06-16:130 2014-06-17:130 2014-06-18:130 2014-06-19:130 
2014-06-20:130 2014-06-21:130 2014-06-22:130 2014-06-23:130 2014-06-24:130 
2014-06-25:130 2014-06-26:130 2014-06-27:130 2014-06-28:130 2014-06-29:130 
2014-06-30:130 2014-07-01:130 2014-07-02:130 2014-07-03:130 2014-07-04:130 
2014-07-05:130 2014-07-06:130 2014-07-07:130 2014-07-08:130 2014-07-09:130 
2014-07-10:130 2014-07-11:130 2014-07-12:130 2014-07-13:130 2014-07-14:130 
2014-07-15:130 2014-07-16:130 2014-07-17:130 2014-07-18:130 2014-07-19:170 
2014-07-20:170 2014-07-21:170 2014-07-22:170 2014-07-23:170 2014-07-24:170 
2014-07-25:170 2014-07-26:170 2014-07-27:170 2014-07-28:170 2014-07-29:170 
2014-07-30:170 2014-07-31:170 2014-08-01:170 2014-08-02:170 2014-08-03:170 
2014-08-04:170 2014-08-05:170 2014-08-06:170 2014-08-07:170 2014-08-08:170 
2014-08-09:170 2014-08-10:170 2014-08-11:170 2014-08-12:170 2014-08-13:170 
2014-08-14:170 2014-08-15:170 2014-08-16:170 2014-08-17:170 2014-08-18:170 
2014-08-19:170 2014-08-20:170 2014-08-21:170 2014-08-22:170 2014-08-23:170 
2014-08-24:170 2014-08-25:170 2014-08-26:170 2014-08-27:170 2014-08-28:170 
2014-08-29:170 2014-08-30:170


As you can see the structure of the string is date:price.

Basically, I would like to parse the string to get the price for a particular 
period and sort by that price.
I’ve already developed the java plugin for the custom function query and I’m at 
the point where my code compiles, runs, executes, etc. Solr is happy with my 
code.

Example:
price(daily_prices_str,2015-01-01,2015-01-03)

If I run this query I can see the correct price in the score field:

/select?price=price(daily_prices_str,2015-01-01,2015-01-03)&q={!func}$price

One of the problems is that I cannot sort by function result.
If I run this query:

/select?price=price(daily_prices_str,2015-01-01,2015-01-03)&q={!func}$price&sort=$price+asc

I get a 404 saying that "sort param could not be parsed as a query, and is not 
a field that exists in the index: $price"
But it works with a workaround:

/select?price=sum(0,price(daily_prices_str,2015-01-01,2015-01-03))&q={!func}$price&sort=$price+asc

The main problem is that I cannot filter by range:

/select?price=sum(0,price(daily_prices_str,2015-1-1,2015-1-3))&q={!frange l=100 
u=400}$price

Maybe I'm going about this totally incorrectly?




Re: Please add me to Contributors Group

2014-05-15 Thread Stefan Matheis
Hey

I’ve added you, thanks for contributing :)

-Stefan  


On Tuesday, May 13, 2014 at 1:03 PM, Gireesh C. Sahukar wrote:

> Hi,
>  
> I'd like to be added to the contributors group. My wiki username is gireesh
>  
>  
> Thanks
>  
> Gireesh
>  



Re: Solr, How to index scripts *.sh and *.SQL

2014-05-15 Thread Alexei Martchenko
Same in Windows. just plain text files, no metadata, no headers.


alexei martchenko
Facebook  |
Linkedin|
Steam  |
4sq| Skype: alexeiramone |
Github  | (11) 9 7613.0966 |


2014-05-11 4:32 GMT-03:00 Gora Mohanty :

> On 8 May 2014 12:25, Visser, Marc  wrote:
> >
> > HI All,
> > Recently I have set up an image with SOLR. My goal is to index and
> extract files on a Windows and Linux server. It is possible for me to index
> and extract data from multiple file types. This is done by the SOLR CELL
> request handler. See the post.jar cmd below.
> >
> > j ava -Dauto -Drecursive -jar post.jar Y:\ SimplePostTool version 1.5
> Posting files to base url localhost:8983/solr/update.. Entering auto mode.
> File endings considered are xml,json,csv,pdf,doc,docx,ppt,pp
> tx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log Entering recursive
> mode, max depth=999, delay=0s 0 files indexed.
> >
> > Is it possible to index and extract metadata/content from file types
> like .sh and .sql? If it is possible I would like to know how of course :)
>
> Don't know about Windows, but on Linux these are just text files. What
> metadata are you referring to? Normally, a Linux text file only has
> content,
> unless you are talking about metadata such as obtained from:
>file cmd.sh
>
> Regards,
> Gora
>


Difference between search strings

2014-05-15 Thread rio
Can someone please tell me the difference between searching a text in the
following ways

1. q=Exact_Word:"samplestring" -> What does it tell to solr  ?

2. q=samplestring&qf=Exact_Word -> What does it tell to solr  ?

3. q="samplestring"&qf=Exact_Word -> What does it tell to solr  ?
 
I think the first and the third one are the same.  is it correct ? How does
it differ from the second one.

I am trying to understand how enclosing the full term in "" is resolving the
solr specific special character problem? What does it tell to solr  ? e.g If
there is "!" mark in the string solr will identify it as a NOT, "!" is part
of the string. This issue can be corrected if the full string is enclosed in
a "".





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Difference-between-search-strings-tp4135576.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solrj Default Data Format

2014-05-15 Thread Furkan KAMACI
Hmmm, I see that it looks like XML format, but not quite. I have added three
documents and it contains something like this:



id1
id2
id3
id4
d1
d2
d3
d4






Is this the javabin format? I mean an optimized XML with a first byte of
"2"?

Thanks;
Furkan KAMACI


2014-05-07 22:04 GMT+03:00 Furkan KAMACI :

> Hi;
>
> I am testing Solrj. I use Solr 4.5.1 and HttpSolrServer for my test. I
> just generate some SolrInputDocuments and call add method of server to add
> them. When  I track the request I see that data is at XML format instead of
> javabin. Do I miss anything?
>
> Thanks;
> Furkan KAMACI
>


Short hangs when doing collection alias updates

2014-05-15 Thread Greg Walters
Good day list members.

I've got a couple solr clusters in cloud mode that make use of collection 
aliases for offline indexing that we rotate into when indexing is complete. 
Every time we rotate we see a huge jump in response time, a couple timeouts and 
a jump in threads. You can see the results of a rotate at 
http://i.imgur.com/W1iX5dw.png. It looks like solr is queuing requests it 
receives while in the middle of updating the alias then servicing them which 
explains the increase in threads along with response time. Is there a way to 
update an alias that's currently taking requests without the queuing and 
response time hit?

Thanks,
Greg

RE: Easises way to insatll solr cloud with tomcat

2014-05-15 Thread Matt Kuiper
Check out http://heliosearch.com/download.html  

This is a distribution of Apache Solr packaged with Tomcat.

I have found it simple to use.

Matt

-Original Message-
From: Aman Tandon [mailto:amantandon...@gmail.com] 
Sent: Monday, May 12, 2014 6:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Easises way to insatll solr cloud with tomcat

Can anybody help me out??

With Regards
Aman Tandon


On Mon, May 12, 2014 at 1:24 PM, Aman Tandon wrote:

> Hi,
>
> I tried to set up solr cloud with jetty which works fine. But in our 
> production environment we uses tomcat so i need to set up the solr 
> cloud with the tomcat. So please help me out to how to setup solr 
> cloud with tomcat on single machine.
>
> Thanks in advance.
>
> With Regards
> Aman Tandon
>