Re: Multiple facet - fq

2010-10-21 Thread Yavuz Selim YILMAZ
Thanks, guys.
--

Yavuz Selim YILMAZ


2010/10/20 Tim Gilbert tim.gilb...@morningstar.com

 Sorry, what Pradeep said, not Prasad.  My apologies Pradeep.

 -Original Message-
 From: Tim Gilbert
 Sent: Wednesday, October 20, 2010 12:18 PM
 To: 'solr-user@lucene.apache.org'
 Subject: RE: Multiple facet - fq

 As Prasad said:

fq=(category:corporate category:personal)

 But you might want to check your schema.xml to see what you have here:

    <!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
    <solrQueryParser defaultOperator="AND" />

 You can always specify your operator in your search between your facets.


fq=(category:corporate AND category:personal)

 or

fq=(category:corporate OR category:personal)

 I have an application where I am using searches on 10 or more facets with
 AND, OR, + and - options, and it works flawlessly.

fq=(+category:corporate AND -category:personal)

 meaning category is corporate and not personal.
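As a rough illustration of how such a combined clause travels on the wire, here is a minimal, self-contained sketch of percent-encoding an fq parameter. The `buildFq` helper is purely illustrative, not part of Solr or SolrJ, and the `URLEncoder.encode(String, Charset)` overload assumes Java 10+:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class FilterParamDemo {
    // Illustrative helper (not a Solr API): joins clauses with the given
    // boolean operator and percent-encodes the result as an fq parameter.
    public static String buildFq(String op, String... clauses) {
        String raw = "(" + String.join(" " + op + " ", clauses) + ")";
        return "fq=" + URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // (category:corporate OR category:personal), URL-encoded
        System.out.println(buildFq("OR", "category:corporate", "category:personal"));
    }
}
```

The same helper covers the `+`/`-` form, e.g. `buildFq("AND", "+category:corporate", "-category:personal")`.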

 Tim

 -Original Message-
 From: Pradeep Singh [mailto:pksing...@gmail.com]
 Sent: Wednesday, October 20, 2010 11:56 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Multiple facet - fq

 fq=(category:corporate category:personal)

 On Wed, Oct 20, 2010 at 7:39 AM, Yavuz Selim YILMAZ
 yvzslmyilm...@gmail.com
  wrote:

  Under the category facet, there are multiple selections, which can be
  personal, corporate or other.
 
  How can I get both personal and corporate ones? I tried
  fq=category:corporate&fq=category:personal
 
  It looks easy, but I can't find the solution.
 
 
  --
 
  Yavuz Selim YILMAZ
 



Re: RAM increase

2010-10-21 Thread Gora Mohanty
On Thu, Oct 21, 2010 at 10:46 AM, satya swaroop satya.yada...@gmail.com wrote:
 Hi all,
              I increased my RAM size to 8GB and I want 4GB of it to be used
 for solr itself. Can anyone tell me the way to allocate the RAM for
 solr?
[...]

You will need to set up the allocation of RAM for Java, via the -Xmx
and -Xms options. If you are using something like Tomcat, that would
be done in the Tomcat configuration. E.g., these options can be added
inside /etc/init.d/tomcat6 on new Debian/Ubuntu systems.
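A minimal sketch of the kind of line involved, assuming a Debian-style Tomcat init script that honours JAVA_OPTS (the 1g/4g values are examples matching this thread, not recommendations):

```shell
# Sketch: heap options for a Tomcat-hosted Solr.
# On Debian/Ubuntu this line typically goes in /etc/default/tomcat6,
# which /etc/init.d/tomcat6 sources at startup.
JAVA_OPTS="${JAVA_OPTS} -Xms1g -Xmx4g"
```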

Regards,
Gora


Re: why is Solr so much slower than Lucene?

2010-10-21 Thread kafka0102
I found the problem's cause: it's the DocSetCollector. My filter query result's 
size is about 300, so DocSetCollector.getDocSet() returns an OpenBitSet, and 
300 OpenBitSet.fastSet(doc) ops are too slow. So I used SolrIndexSearcher's 
TopFieldDocs search(Query query, Filter filter, int n, Sort sort), and now 
performance is normal.
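For intuition, the trade-off the DocSetCollector is making can be sketched in plain JDK terms. The class and method names below are illustrative, not Solr's real ones, and the threshold only roughly mirrors Solr's heuristic:

```java
import java.util.BitSet;

public class DocSetSketch {
    // Illustrative model, not Solr code: a small result set is cheapest kept
    // as a plain int array; only once the hit count is a sizable fraction of
    // maxDoc does paying for a maxDoc-sized bit set (one set() per hit,
    // analogous to OpenBitSet.fastSet) make sense.
    public static Object collect(int[] matchingDocs, int maxDoc) {
        if (matchingDocs.length <= (maxDoc >> 6)) {
            return matchingDocs.clone();        // small-set representation
        }
        BitSet bits = new BitSet(maxDoc);       // large-set representation
        for (int d : matchingDocs) {
            bits.set(d);                        // one set() call per hit
        }
        return bits;
    }

    public static void main(String[] args) {
        System.out.println(collect(new int[]{1, 5, 9}, 1024).getClass().getSimpleName());
    }
}
```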




At 2010-10-20 19:21:27,kafka0102 kafka0...@163.com wrote:

I find solr's SolrIndexSearcher.search(QueryResult qr, QueryCommand cmd) too 
slow. My index's size is about 500M, and the record count is 3,984,274. My 
query is like q=xx&fq=fid:1&fq=atm:[int_time1 TO int_time2].
fid's type is <fieldType name="int" class="solr.TrieIntField" 
precisionStep="0" omitNorms="true" positionIncrementGap="0"/> and 
atm's type is <fieldType name="sint" class="solr.TrieIntField" 
precisionStep="8" omitNorms="true" positionIncrementGap="0"/>.
For the test, I disabled solr's cache config and used Lucene code 
like the following:

  private void test2(final ResponseBuilder rb) {
    try {
      final SolrQueryRequest req = rb.req;
      final SolrIndexSearcher searcher = req.getSearcher();
      final SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
      final ExecuteTimeStatics timeStatics =
          ExecuteTimeStatics.getExecuteTimeStatics();
      final ExecuteTimeUnit staticUnit =
          timeStatics.addExecuteTimeUnit("test2");
      staticUnit.start();
      final List<Query> query = cmd.getFilterList();
      final BooleanQuery booleanFilter = new BooleanQuery();
      for (final Query q : query) {
        booleanFilter.add(new BooleanClause(q, Occur.MUST));
      }
      booleanFilter.add(new BooleanClause(cmd.getQuery(), Occur.MUST));
      logger.info("q:" + query);
      final Sort sort = cmd.getSort();
      final TopFieldDocs docs = searcher.search(booleanFilter, null, 20, sort);
      final StringBuilder sbBuilder = new StringBuilder();
      for (final ScoreDoc doc : docs.scoreDocs) {
        sbBuilder.append(doc.doc).append(',');
      }
      logger.info("hits:" + docs.totalHits + ",result:" + sbBuilder);
      staticUnit.end();
    } catch (final Exception e) {
      throw new RuntimeException(e);
    }
  }

For the test, I first called the code above and then solr's search(...). The 
result: Lucene takes about 20ms and solr about 70ms.
I'm so confused.
Also, I wrote another version using a filter, shown below, but the range 
query's result count is not correct.
Does anybody know the reasons?

  private void test1(final ResponseBuilder rb) {
    try {
      final SolrQueryRequest req = rb.req;
      final SolrIndexSearcher searcher = req.getSearcher();
      final SolrIndexSearcher.QueryCommand cmd = rb.getQueryCommand();
      final ExecuteTimeStatics timeStatics =
          ExecuteTimeStatics.getExecuteTimeStatics();
      final ExecuteTimeUnit staticUnit =
          timeStatics.addExecuteTimeUnit("test1");
      staticUnit.start();
      final List<Query> query = cmd.getFilterList();
      final BooleanFilter booleanFilter = new BooleanFilter();
      for (final Query q : query) {
        setFilter(booleanFilter, q);
      }
      final Sort sort = cmd.getSort();
      final TopFieldDocs docs =
          searcher.search(cmd.getQuery(), booleanFilter, 20, sort);
      logger.info("hits:" + docs.totalHits);

      staticUnit.end();
    } catch (final Exception e) {
      throw new RuntimeException(e);
    }
  }




Using a custom repository to store solr index files

2010-10-21 Thread Tharindu Mathew
Hi everyone,

I was looking at using the Embedded Solr server through SolrJ and I
have a couple of concerns.

I'd like to use a custom repository to store my index. Is there a way
I can define this. Is there a data output interface I can implement
for this purpose?

Or can this be done in some way?

Any feedback is appreciated.

Thanks in advance.

-- 
Regards,

Tharindu


A bug in ComplexPhraseQuery ?

2010-10-21 Thread jmr

Hi,

We have installed ComplexPhraseQuery and since that we can see strange
behaviour in proximity search.

We have the 2 following queries:
(text:(protein digest~50))
(text:(digest protein~50))

Without ComplexPhraseQuery, both queries are returning 6 documents matching.
With ComplexPhraseQuery, query 1 returns 4 documents and query 2 returns 5
documents!

It seems that proximity search is broken. Is this a known problem ?

Thanks for your help.

Regards,
J-Michel
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/A-bug-in-ComplexPhraseQuery-tp1744659p1744659.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: RAM increase

2010-10-21 Thread Jean-Sebastien Vachon

You will also need to switch to a 64-bit JVM.
You might have to add the `-d64` flag as well as `-Xms` and `-Xmx`.

- Original Message - 
From: Gora Mohanty g...@mimirtech.com

To: solr-user@lucene.apache.org
Sent: Thursday, October 21, 2010 2:34 AM
Subject: Re: RAM increase


On Thu, Oct 21, 2010 at 10:46 AM, satya swaroop satya.yada...@gmail.com 
wrote:

Hi all,
I increased my RAM size to 8GB and i want 4GB of it to be used
for solr itself. can anyone tell me the way to allocate the RAM for the
solr.

[...]

You will need to set up the allocation of RAM for Java, via the -Xmx
and -Xms options. If you are using something like Tomcat, that would
be done in the Tomcat configuration. E.g., these options can be added
inside /etc/init.d/tomcat6 on new Debian/Ubuntu systems.

Regards,
Gora 



Re: Using a custom repository to store solr index files

2010-10-21 Thread Upayavira


On Thu, 21 Oct 2010 14:42 +0530, Tharindu Mathew mcclou...@gmail.com
wrote:
 Hi everyone,
 
 I was looking at using the Embedded Solr server through SolrJ and I
 have a couple of concerns.
 
 I'd like to use a custom repository to store my index. Is there a way
 I can define this. Is there a data output interface I can implement
 for this purpose?
 
 Or can this be done in some way?

Why do you want to do this?

Solr embeds a lucene index, and Lucene has a Directory interface, that
can be implemented differently (something other than the default
FSDirectory implementation).
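To make the Directory idea concrete, here is a toy, pure-JDK sketch of the pattern. This is not Lucene's actual Directory interface (the real one is stream-based and richer); the names are illustrative. The point is that the index engine only ever reads and writes named byte containers, so backing the index with a custom repository amounts to implementing these operations against that store:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class DirectorySketch {
    // Toy analogue of Lucene's Directory abstraction (illustrative only).
    public interface ToyDirectory {
        Set<String> listAll();
        byte[] readFile(String name);
        void writeFile(String name, byte[] data);
        void deleteFile(String name);
    }

    // In-memory implementation; a custom repository would implement the same
    // operations against its own storage instead of this map.
    public static class RamToyDirectory implements ToyDirectory {
        private final Map<String, byte[]> files = new HashMap<>();
        public Set<String> listAll() { return files.keySet(); }
        public byte[] readFile(String name) { return files.get(name); }
        public void writeFile(String name, byte[] data) { files.put(name, data); }
        public void deleteFile(String name) { files.remove(name); }
    }

    public static void main(String[] args) {
        ToyDirectory dir = new RamToyDirectory();
        dir.writeFile("segments_1", new byte[]{1, 2, 3});
        System.out.println(dir.listAll());
    }
}
```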

Upayavira


MoreLikeThis explanation?

2010-10-21 Thread darren
Hi,
  Does the latest Solr provide an explanation for results returned by MLT?
I want to get the interesting terms for each result that overlap with the
source document. This set of terms will vary from result to result
possibly.

Thanks!
Darren


Re: Import From MYSQL database

2010-10-21 Thread virtas

You need to look into actual logs of the system. There you will see more
details why import failed. 

check tomcat or jetty logs
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Import-From-MYSQL-database-tp1738753p1745246.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: MoreLikeThis explanation?

2010-10-21 Thread Koji Sekiguchi

(10/10/21 20:33), dar...@ontrenet.com wrote:

Hi,
   Does the latest Solr provide an explanation for results returned by MLT?


No, but there is an open issue:

https://issues.apache.org/jira/browse/SOLR-860

Koji

--
http://www.rondhuit.com/en/


FieldCache

2010-10-21 Thread Mathias Walter
Hi,

does a field which should be cached need to be indexed?

I have a binary field which is just stored. Retrieving it via 
FieldCache.DEFAULT.getTerms returns empty ByteRefs.

Then I found the following post: 
http://www.mail-archive.com/d...@lucene.apache.org/msg05403.html

How can I use the FieldCache with a binary field?

--
Kind regards,
Mathias



Re: why is Solr so much slower than Lucene?

2010-10-21 Thread Yonik Seeley
2010/10/21 kafka0102 kafka0...@163.com:
 I found the problem's cause: it's the DocSetCollector. My filter query 
 result's size is about 300, so DocSetCollector.getDocSet() returns an 
 OpenBitSet, and 300 OpenBitSet.fastSet(doc) ops are too slow.


As I said in my other response to you, that's a perfect reason why you
want Solr to cache that for you (unless the filter will be different
each time).
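The cache Yonik refers to is the filterCache configured in solrconfig.xml. A sketch with illustrative sizes (tune to your own query mix):

```xml
<!-- solrconfig.xml: reuse computed filter DocSets across requests instead
     of rebuilding the OpenBitSet each time (sizes are examples only) -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>
```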

-Yonik
http://www.lucidimagination.com


Re: MoreLikeThis explanation?

2010-10-21 Thread Darren Govoni
Thank you!

On Thu, 2010-10-21 at 23:03 +0900, Koji Sekiguchi wrote:

 (10/10/21 20:33), dar...@ontrenet.com wrote:
  Hi,
 Does the latest Solr provide an explanation for results returned by MLT?
 
 No, but there is an open issue:
 
 https://issues.apache.org/jira/browse/SOLR-860
 
 Koji
 




Re: RAM increase

2010-10-21 Thread Jonathan Rochkind

Jean-Sebastien Vachon wrote:

You will also need to switch to a 64-bit JVM.
You might have to add the `-d64` flag as well as `-Xms` and `-Xmx`.

  
I've actually had no luck googling what's up with -d64.  Can you 
point me to any documentation on what effect it has, and in particular 
on what the boundary -Xmx size is that requires -d64?


Jonathan




Re: RAM increase

2010-10-21 Thread Dennis Gearon
Everything over ~3.7GB RAM (2^32, use your calculator) needs 64-bit 
addressing.

Dennis Gearon

Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better idea to learn from others’ mistakes, so you do not have to make them 
yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life,
  otherwise we all die.


--- On Thu, 10/21/10, Jonathan Rochkind rochk...@jhu.edu wrote:

 From: Jonathan Rochkind rochk...@jhu.edu
 Subject: Re: RAM increase
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Date: Thursday, October 21, 2010, 9:56 AM
 Jean-Sebastien Vachon wrote:
  You will also need to switch to a 64 bits JVM
  You might have to add the `-d64` flag as well as the
 `-Xms` and `-Xmx`
  
    
 I've actually had no luck googling what's up with the
 -d64.  Can you point me to any documentation on what
 effect it has, and on particular on what the boundary -Xmx
 size is that requires -d64?
 
 Jonathan
 
 



DistributedSearchDesign and multiple requests

2010-10-21 Thread Jeff Wartes

I'm using Solr 1.4. My observations and this page 
http://wiki.apache.org/solr/DistributedSearchDesign#line-254 indicate that the 
general strategy for Distributed Search is something like:
1. Query the shards with the user's query and fl=unique_field,score
2. Re-query (maybe a subset of) the shards for certain documents by 
unique_field with the field list the user requested.
3. Maybe re-query the shards again to flesh out faceting info.

I'm encountering a significant performance penalty using DistributedSearch due 
to these additional queries, and it seems like there are some obvious 
optimizations that could avoid them in certain cases. 

For example, a way to say "I claim the fields I'm requesting are small enough 
that querying again for stored fields is worse than just getting the stored 
fields in the first request" 
(assert_tiny_data=true&fl=tiny_stored_field,unique_field). 
Or: if the field list of the original query is contained in the first round of 
shard requests, don't bother querying again for more fields 
(fl=unique_field,score).

Has anyone else looked into this? I'd be interested to learn if there are 
issues that make these kinds of shortcuts difficult before I dig in.
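A toy model of the two-pass flow may help frame where the optimization would slot in: if pass 1 already carried the requested fields, pass 2 could be skipped. The shard contents, scores, and field names below are entirely made up:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DistributedSketch {
    // Toy shards: unique_field -> stored fields (illustrative data only).
    static final Map<String, Map<String, String>> SHARD_A = Map.of(
        "doc1", Map.of("title", "alpha"),
        "doc2", Map.of("title", "beta"));
    static final Map<String, Map<String, String>> SHARD_B = Map.of(
        "doc3", Map.of("title", "gamma"));
    // Pass 1 result: unique_field -> score, as if merged from all shards.
    static final Map<String, Double> SCORES = Map.of(
        "doc1", 0.9, "doc2", 0.3, "doc3", 0.7);

    public static List<Map<String, String>> search(int topN) {
        // Pass 1: rank by score and keep only the global top-N ids.
        List<String> winners = SCORES.entrySet().stream()
            .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
            .limit(topN)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
        // Pass 2: fetch stored fields only for the winning ids.
        return winners.stream()
            .map(id -> SHARD_A.containsKey(id) ? SHARD_A.get(id) : SHARD_B.get(id))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(search(2)); // top 2 ids by score, with their fields
    }
}
```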

Thanks,
  -Jeff Wartes


RE: RAM increase

2010-10-21 Thread Steven A Rowe
Memory limits info:

http://www.oracle.com/technetwork/java/hotspotfaq-138619.html#gc_heap_32bit

-d64 usage info:

http://stackoverflow.com/questions/1443677/what-impact-if-any-does-the-d64-swtich-have-on-sun-jvm-resident-memory-usage

Steve

 -Original Message-
 From: Dennis Gearon [mailto:gear...@sbcglobal.net]
 Sent: Thursday, October 21, 2010 1:08 PM
 To: solr-user@lucene.apache.org
 Subject: Re: RAM increase
 
 Everything ovger ~3.7 3.7GB RAM (2^32, use your calculator) needs 64 bit
 addressing.
 
 Dennis Gearon
 
 Signature Warning
 
 It is always a good idea to learn from your own mistakes. It is usually a
 better idea to learn from others’ mistakes, so you do not have to make
 them yourself. from
 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
 
 EARTH has a Right To Life,
   otherwise we all die.
 
 
 --- On Thu, 10/21/10, Jonathan Rochkind rochk...@jhu.edu wrote:
 
  From: Jonathan Rochkind rochk...@jhu.edu
  Subject: Re: RAM increase
  To: solr-user@lucene.apache.org solr-user@lucene.apache.org
  Date: Thursday, October 21, 2010, 9:56 AM
  Jean-Sebastien Vachon wrote:
   You will also need to switch to a 64 bits JVM
   You might have to add the `-d64` flag as well as the
  `-Xms` and `-Xmx`
  
  
  I've actually had no luck googling what's up with the
  -d64.  Can you point me to any documentation on what
  effect it has, and on particular on what the boundary -Xmx
  size is that requires -d64?
 
  Jonathan
 
 
 


how well does multicore scale?

2010-10-21 Thread mike anderson
I'm exploring the possibility of using cores as a solution to bookmark
folders in my solr application. This would mean I'll need tens of thousands
of cores... does this seem reasonable? I have plenty of CPUs available for
scaling, but I wonder about the memory overhead of adding cores (aside from
needing to fit the new index in memory).

Thoughts?

-mike


[solrmarc-tech] JVM -XX:+UseCompressedOops

2010-10-21 Thread Jonathan Rochkind

Is anyone using the newish JVM flag -XX:+UseCompressedOops with Solr?  Do you
have reason to believe it's helpful? Is there any way it can be harmful?

I am hoping it reduces my memory consumption somewhat.

An old thread with someone asking the same question, but with no
answers:
http://osdir.com/ml/solr-user.lucene.apache.org/2009-07/msg00663.html

--
You received this message because you are subscribed to the Google 
Groups solrmarc-tech group.

To post to this group, send email to solrmarc-t...@googlegroups.com.
To unsubscribe from this group, send email to 
solrmarc-tech+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/solrmarc-tech?hl=en.




multiple cores, solr.xml and replication

2010-10-21 Thread didier deshommes
Hi there,
I noticed that the java-based replication does not make replication of
multiple cores automatic. For example, if I have a master with 7
cores, any slave I set up has to explicitly know about each of the 7
cores to be able to replicate them. This information is stored in
solr.xml, and since this file is outside the conf/ directory, it's
impossible to make the java-based replication copy this file over to each
slave. Is this by design? For those of you doing multicore
replication, how do you handle it?

Is overwriting solr.xml when persist=true is used thread-safe? What
happens if I create 2 different cores at the same time? I ask because
I have 7 cores total and I always end with only 2 or 3 cores in my
solr.xml after doing a bulk delta-import across cores.

didier


Re: how well does multicore scale?

2010-10-21 Thread Jonathan Rochkind
No, it does not seem reasonable.  Why do you think you need a separate 
core for every user? 


mike anderson wrote:

I'm exploring the possibility of using cores as a solution to bookmark
folders in my solr application. This would mean I'll need tens of thousands
of cores... does this seem reasonable? I have plenty of CPUs available for
scaling, but I wonder about the memory overhead of adding cores (aside from
needing to fit the new index in memory).

Thoughts?

-mike

  


Re: multiple cores, solr.xml and replication

2010-10-21 Thread Shawn Heisey

On 10/21/2010 1:42 PM, didier deshommes wrote:

I noticed that the java-based replication does not make replication of
multiple core automatic. For example, if I have a master with 7
cores, any slave I set up has to explicitly know about each of the 7
cores to be able to replicate them. This information is stored in
solr.xml, and since this file is out of the conf/ directory, it's
impossible to make the java-based replication copy this file over each
slave. Is this by design? For those of you  doing multicore
replication, how do you handle it?


My slave replication handler looks like this, used for all cores.  The 
solr.core.name parameter is dynamically replaced with the name of the 
current core:


<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://HOST:8983/solr/${solr.core.name}/replication</str>
    <str name="pollInterval">00:00:15</str>
  </lst>
</requestHandler>

Shawn



Re: multiple cores, solr.xml and replication

2010-10-21 Thread didier deshommes
On Thu, Oct 21, 2010 at 3:00 PM, Shawn Heisey s...@elyograg.org wrote:
 On 10/21/2010 1:42 PM, didier deshommes wrote:

 I noticed that the java-based replication does not make replication of
 multiple core automatic. For example, if I have a master with 7
 cores, any slave I set up has to explicitly know about each of the 7
 cores to be able to replicate them. This information is stored in
 solr.xml, and since this file is out of the conf/ directory, it's
 impossible to make the java-based replication copy this file over each
 slave. Is this by design? For those of you  doing multicore
 replication, how do you handle it?

 My slave replication handler looks like this, used for all cores.  The
 solr.core.name parameter is dynamically replaced with the name of the
 current core:

I use this configuration too, but doesn't this assume that solr.xml is
the same on master and slave? What happens when the master creates a new
core?

didier


 <requestHandler name="/replication" class="solr.ReplicationHandler">
   <lst name="slave">
     <str name="masterUrl">http://HOST:8983/solr/${solr.core.name}/replication</str>
     <str name="pollInterval">00:00:15</str>
   </lst>
 </requestHandler>

 Shawn




OutOfMemory and auto-commit

2010-10-21 Thread Jonathan Rochkind
If I do _not_ have any auto-commit enabled, and add 500k documents and 
commit at end, no problem.


If I instead set auto-commit maxDocs to 100,000 (a pretty large number), 
and try to add 500k docs, with autocommits theoretically happening every 
100k... I run into an OutOfMemory error.


Can anyone think of any reasons that would cause this, and how to 
resolve it? 

All I can think of is that in the first case, my newSearcher and 
firstSearcher warming queries don't run until the 'document add' is 
completely done. In the second case, there are newSearcher and 
firstSearcher warming queries happening at the same time another process 
is continuing to stream 'add's to Solr.   Although at a maxDocs of 
100,000, I shouldn't (I think) get _overlapping_ warming queries; the 
warming queries should be done before the next commit. I think. But 
nonetheless, just the fact that warming queries are happening at the 
same time 'add's are continuing to stream: could that be enough to 
somehow increase memory usage enough to run into OOM?
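For reference, the autocommit setup being described lives in solrconfig.xml. A sketch (the maxDocs value mirrors the number in this message; maxWarmingSearchers is one knob for bounding concurrent warm-ups):

```xml
<!-- solrconfig.xml: commit every 100,000 added docs; each commit opens and
     warms a new searcher while adds keep streaming in -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>100000</maxDocs>
  </autoCommit>
</updateHandler>

<!-- cap how many searchers may warm concurrently -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```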


Re: A bug in ComplexPhraseQuery ?

2010-10-21 Thread Ahmet Arslan


--- On Thu, 10/21/10, jmr jmpala...@free.fr wrote:

 From: jmr jmpala...@free.fr
 Subject: A bug in ComplexPhraseQuery ?
 To: solr-user@lucene.apache.org
 Date: Thursday, October 21, 2010, 12:53 PM
 
 Hi,
 
 We have installed ComplexPhraseQuery and since that we can
 see strange
 behaviour in proximity search.
 
 We have the 2 following queries:
 (text:(protein digest~50))
 (text:(digest protein~50))
 
 Without ComplexPhraseQuery, both queries are returning 6
 documents matching.
 With ComplexPhraseQuery, query 1 returns 4 documents and
 query 2 returns 5
 documents!
 
 It seems that proximity search is broken. Is this a known
 problem ?

ComplexPhraseQuery is an ordered phrase query, where Lucene's default 
PhraseQuery is unordered. With ComplexPhrase, the order of terms is important.


  


Re: Multiple Similarity

2010-10-21 Thread Ahmet Arslan
 Is it possible to define different Similarity classes for
 different fields?

No. See http://search-lucene.com/m/g9cVf23EQO11/

 We have a use case where we are interested in avoid term
 frequency (tf) when
 our fields are multiValued.

Maybe omitTermFreqAndPositions="true"?


  


Re: multiple cores, solr.xml and replication

2010-10-21 Thread Shawn Heisey

On 10/21/2010 2:14 PM, didier deshommes wrote:

I use this configuration too but doesn't this assume that solr.xml is
the same in master and slave? what happens when master creates a new
core?


That's a very good question, one that I can't answer.  I don't 
dynamically create new cores.  If you create the same core on the slave 
and its configuration includes that replication config, my expectation 
(until proven otherwise) would be that it should work.





Re: different results depending on result format

2010-10-21 Thread Mike Sokolov
quick follow-up: I also notice that the query from solrj gets version=1, 
whereas the admin webapp puts version=2.2 on the query string, although 
this param doesn't seem to change the xml results at all.  Does this 
indicate an older version of solrj perhaps?


-Mike

On 10/21/2010 04:47 PM, Mike Sokolov wrote:
I'm experiencing something really weird: I get different results 
depending on whether I specify wt=javabin, and retrieve using SolrJ, 
or wt=xml.  I spent quite a while staring at query params to make sure 
everything else is the same, and they do seem to be.  At first I 
thought the problem related to the javabin format change that has been 
talked about recently, but I am using solr 1.4.0 and solrj 1.4.0.


Notice in the two entries that the wt param is different and the hits 
result count is different.


Oct 21, 2010 4:22:19 PM org.apache.solr.core.SolrCore execute
INFO: [bopp.ba] webapp=/solr path=/select/ 
params={wt=xml&rows=20&start=0&facet=true&facet.field=ref_taxid_ms&q=*:*&fl=uri,meta_ss&version=1} 
hits=261 status=0 QTime=1

Oct 21, 2010 4:22:28 PM org.apache.solr.core.SolrCore execute
INFO: [bopp.ba] webapp=/solr path=/select 
params={wt=javabin&rows=20&start=0&facet=true&facet.field=ref_taxid_ms&q=*:*&fl=uri,meta_ss&version=1} 
hits=57 status=0 QTime=0



The xml format results seem to be the correct ones. So one thought I 
had was that I could somehow fall back to using xml format in solrj, 
but I tried SolrQuery.set("wt", "xml") and that didn't have the desired 
effect (I get 'wt=javabin&wt=javabin' in the log, i.e. the param is 
repeated, but still javabin).



Am I crazy? Is this a known issue?

Thanks for any suggestions



Solr sorting problem

2010-10-21 Thread Moazzam Khan
Hey guys,

I have a list of people indexed in Solr. I am trying to sort by their
first names but I keep getting results that are not alphabetically
sorted (I see the names starting with W before the names starting with
A). I have a feeling that the results are first being sorted by
relevancy then sorted by first name.

Is there a way I can get the results to be sorted alphabetically?

Thanks,
Moazzam


Re: Solr sorting problem

2010-10-21 Thread Jayendra Patil
We need additional information.
Sorting is easy in Solr: just pass the sort parameter.

However, when it comes to text sorting, it depends on how you analyze
and tokenize your fields.
Sorting does not work on fields with multiple tokens.
http://wiki.apache.org/solr/FAQ#Why_Isn.27t_Sorting_Working_on_my_Text_Fields.3F
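Concretely, the usual fix is a separate untokenized field used only for sorting. A schema.xml sketch (field and type names here are illustrative; adjust to your schema):

```xml
<!-- schema.xml: sort on an untokenized copy of the display field -->
<field name="first_name"      type="text"   indexed="true" stored="true"/>
<field name="first_name_sort" type="string" indexed="true" stored="false"/>
<copyField source="first_name" dest="first_name_sort"/>
```

Then sort with sort=first_name_sort asc instead of the tokenized field.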

On Thu, Oct 21, 2010 at 7:24 PM, Moazzam Khan moazz...@gmail.com wrote:

 Hey guys,

 I have a list of people indexed in Solr. I am trying to sort by their
 first names but I keep getting results that are not alphabetically
 sorted (I see the names starting with W before the names starting with
 A). I have a feeling that the results are first being sorted by
 relevancy then sorted by first name.

 Is there a way I can get the results to be sorted alphabetically?

 Thanks,
 Moazzam



Strange file name after installing solr

2010-10-21 Thread Bac Hoang

 Hello folks,

I'm very new user to solr. Please help

What I have in hand: 1) apache-solr-1.4.1; 2) Geronimo

After installing solr.war using the Geronimo administration GUI, I got a 
strangely named file under 
/opt/dev/ofwi-geronimo2.1.6/repository/default/solr/1287558884961/solr-1287558884961.war. 
Is this alright, or is anything abnormal? Geronimo says that solr is in 
running status, but when I start it, I get an error: 
java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in 
classpath or 'solr/conf/', cwd=/opt/dev...


Thanks indeed for your time

With regards,
Bac Hoang




Re: OutOfMemory and auto-commit

2010-10-21 Thread Lance Norskog
Yes. Indexing activity suspends until the commit finishes, then
resumes. Having both queries and indexing on the same Solr will have
this memory problem.

Lance

On Thu, Oct 21, 2010 at 1:16 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
 If I do _not_ have any auto-commit enabled, and add 500k documents and
 commit at end, no problem.

 If I instead set auto-commit maxDocs to 100,000 (a pretty large number), and
 try to add 500k docs, with autocommits theoretically happening every 100k...
 I run into an OutOfMemory error.

 Can anyone think of any reasons that would cause this, and how to resolve
 it?
 All I can think of is that in the first case, my newSearcher and
 firstSearcher warming queries don't run until the 'document add' is
 completely done. In the second case, there are newSearcher and firstSearcher
 warming queries happening at the same time another process is continuing to
 stream 'add's to Solr.   Although at a maxDocs of 100,000, I shouldn't (I
 think) get _overlapping_ warming queries, the warming queries should be done
 before the next commit. I think. But nonetheless, just the fact that warming
 queries are happening at the same time 'add's are continuing to stream,
 could that be enough to somehow increase memory usage enough to run into
 OOM?




-- 
Lance Norskog
goks...@gmail.com


Re: how can i use solrj binary format for indexing?

2010-10-21 Thread Jason, Kim

Hi Gora, I really appreciate it.
Your reply was a great help to me. :)
I hope everything is fine with you.

Regards,
Jason




Gora Mohanty-3 wrote:
 
 On Mon, Oct 18, 2010 at 8:22 PM, Jason, Kim hialo...@gmail.com wrote:
 
 Sorry for the delay in replying. Was caught up in various things this
 week.
 
 Thank you for reply, Gora

 But I still have several questions.
 Did you use separate index?
 If so, you indexed 0.7 million Xml files per instance
 and merged it. Is it Right?
 
 Yes, that is correct. We sharded the data by user ID, so that each of the
 25
 cores held approximately 0.7 million out of the 3.5 million records. We
 could
 have used the sharded indices directly for search, but at least for now
 have
 decided to go with a single, merged index.
 
 Please let me know how to work multiple instances and cores in your case.
 [...]
 
 * Multi-core Solr setup is quite easy, via configuration in solr.xml:
   http://wiki.apache.org/solr/CoreAdmin . The configuration, i.e.,
   schema, solrconfig.xml, etc. need to be replicated across the
   cores.
 * Decide which XML files you will post to which core, and do the
   POST with curl, as usual. You might need to write a little script
   to do this.
 * After indexing on the cores is done, make sure to do a commit
   on each.
 * Merge the sharded indexes (if desired) as described here:
   http://wiki.apache.org/solr/MergingSolrIndexes . One thing to
   watch out for here is disk space. When merging with Lucene
   IndexMergeTool, we found that a rough rule of thumb was that
   intermediate steps in the merge would require about twice as
   much space as the total size of the indexes to be merged. I.e.,
   if one is merging 40GB of data in sharded indexes, one should
   have at least 120GB free.
 
 Regards,
 Gora
 
 

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/how-can-i-use-solrj-binary-format-for-indexing-tp1722612p1750669.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Using a custom repository to store solr index files

2010-10-21 Thread Tharindu Mathew
Thanks for your answer Upayavira. Appreciate it.

I want to do this because of a clustering requirement.

When clustering takes place in the product I'm working on the custom
repository we use replicates accordingly and makes data available to
all nodes. But if this is available on the file system this does not
happen.

So according to your answer I'll get the source and take a look at the
Directory interface.

On Thu, Oct 21, 2010 at 4:53 PM, Upayavira u...@odoko.co.uk wrote:


 On Thu, 21 Oct 2010 14:42 +0530, Tharindu Mathew mcclou...@gmail.com
 wrote:
 Hi everyone,

 I was looking at using the Embedded Solr server through SolrJ and I
 have a couple of concerns.

 I'd like to use a custom repository to store my index. Is there a way
 I can define this. Is there a data output interface I can implement
 for this purpose?

 Or can this be done in some way?

 Why do you want to do this?

 Solr embeds a lucene index, and Lucene has a Directory interface, that
 can be implemented differently (something other than the default
 FSDirectory implementation).

 Upayavira




-- 
Regards,

Tharindu


Re: how well does multicore scale?

2010-10-21 Thread Tharindu Mathew
Hi Mike,

I've also considered using separate cores in a multi-tenant
application, i.e. a separate core for each tenant/domain. But cores
do not suit that purpose.

If you check the documentation, no real API support exists for doing
this dynamically through SolrJ. And all the use cases I found only had
users configuring cores statically and then using them; that was maybe
2 or 3 cores. Please correct me if I'm wrong, Solr folks.

So you're better off using a single index with a user id field and
applying a filter query on the user id when fetching data.

On Fri, Oct 22, 2010 at 1:12 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 No, it does not seem reasonable.  Why do you think you need a separate core
 for every user?
 mike anderson wrote:

 I'm exploring the possibility of using cores as a solution to bookmark
 folders in my solr application. This would mean I'll need tens of
 thousands
 of cores... does this seem reasonable? I have plenty of CPUs available for
 scaling, but I wonder about the memory overhead of adding cores (aside
 from
 needing to fit the new index in memory).

 Thoughts?

 -mike






-- 
Regards,

Tharindu


Re: A bug in ComplexPhraseQuery ?

2010-10-21 Thread jmr


iorixxx wrote:
 
 ComplexPhraseQuery is an ordered phrase query, where Lucene's default
 PhraseQuery is unordered. With ComplexPhrase, the order of terms is important.
 

Thanks for your answer.
With this request: (text:(protein digest~50)) || (text:(digest
protein~50))
I get my 6 documents.

In my opinion, ordering terms in a proximity search does not make sense!
So the workaround for us is to generate the opposite search every time a
proximity operator is used.
Not very elegant!

Anyway, thanks again for the answer,
J-Michel
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/A-bug-in-ComplexPhraseQuery-tp1744659p1750748.html
Sent from the Solr - User mailing list archive at Nabble.com.