date:20110303

Use either the string fieldType or a field with very little analysis 
(KeywordTokenizer + LowercaseFilter).

 How to obtain perfect match with dismax query??
 
 Example:
 
 I want to search for "hello i love you" with defType=dismax in the title field
 and get only results whose title is exactly "hello i love you", with
 all of these terms
 in this order.
 
 No fewer words and no others.
 How is this possible?
 
 I tried +(hello i love you), but a title such as "hello i
 love you mum" also matches, and I don't want that!
 
 Thanx
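
To illustrate why a string field (or KeywordTokenizer + LowercaseFilter) gives the exact match asked for above, here is a small Python sketch that imitates the two analysis styles. It is only an illustration: the function names are made up for this example, and in Solr the analysis is configured in the schema, not in client code.

```python
def whitespace_lowercase(text):
    """Roughly what a word-tokenized text field does: one lowercase token per word."""
    return text.lower().split()


def keyword_lowercase(text):
    """Roughly KeywordTokenizer + lowercasing: the whole value is a single token."""
    return [text.lower()]


def all_terms_match(query, title):
    """Tokenized field: every query word must occur somewhere in the title."""
    title_tokens = set(whitespace_lowercase(title))
    return all(term in title_tokens for term in whitespace_lowercase(query))


def exact_match(query, title):
    """Keyword-style field: the single query token must equal the single title token."""
    return keyword_lowercase(query) == keyword_lowercase(title)


query = "hello i love you"
print(all_terms_match(query, "Hello I love you mum"))  # True  (the unwanted hit)
print(exact_match(query, "Hello I love you mum"))      # False
print(exact_match(query, "Hello I love you"))          # True
```

With word tokens, the longer title still contains every query term; with a single keyword token, only the identical title matches.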

Re: Selection Between Solr and Relational Database

Well, an RDBMS can be very fast but Solr using fq can be very fast as well. 
Just try fq=group:sports&fq=createdtime:<your time>
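
As a rough sketch of the suggestion above, the following Python snippet builds a select URL with one fq parameter per filter. The host, the handler path and the date range are assumptions made for this example; only the field names group and createdtime come from the thread.

```python
from urllib.parse import urlencode, parse_qs


def build_select_url(base_url, filters, query="*:*"):
    """Build a select URL with the main query and one fq parameter per filter."""
    params = [("q", query)] + [("fq", f) for f in filters]
    return base_url + "?" + urlencode(params)


url = build_select_url(
    "http://localhost:8983/solr/select",  # assumed example host and path
    [
        "group:sports",
        # assumed range, replace with the time window you need
        "createdtime:[2011-03-01T00:00:00Z TO 2011-03-03T00:00:00Z]",
    ],
)
print(url)
```

Keeping each restriction in its own fq lets Solr cache the two filters separately, so the group filter can be reused with a different time window.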

 Dear all,
 
 I started learning Solr two months ago. So far, my system
 runs well in a Solr cluster.
 
 I have a question about implementing one feature in my system. When
 retrieving documents by keyword, I believe Solr is faster than a relational
 database. However, for the following operations, I guess the
 performance must be lower. Is that right?
 
 What I am trying to do is as follows.
 
 1) All of the documents in Solr have one field which is used to
 differentiate them; different categories have different values in that
 field, e.g., Group; the documents are classified as news, sports,
 entertainment and so on.
 
 2) Retrieve all of the documents by the field, Group.
 
 3) Besides the Group field, there is another field called
 CreatedTime. I will filter the documents retrieved by Group according to the
 value of CreatedTime. The filtered documents are the final results I need.
 
 I guess the performance of this operation is lower than with a relational database, right?
 Could you please explain why?
 
 Best regards,
 Li Bing

Re: Solr TermsComponent: space in term

2011-03-03 Thread shrinath.m

Why was this thread left unanswered? Is there no way to achieve what the OP
asked for?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp1898889p2624203.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread Jan Høydahl

Hi,

This is simply great! Bravo!

This alone is worth including, but I also (of course) have some comments/ideas:

The links section on top:
 * Move the links on top to bottom, reserving the top for navigation.
 * The "send email" link could be changed to "Community forum" and, instead of
   linking to mailto:solr-user@lucene.apache.org, link to
   http://wiki.apache.org/solr/UsingMailingLists
 * Add a link to IRC chat. http://webchat.freenode.net/?channels=#solr
   That would surely increase the activity on the channel :)
 * Allow for custom links à la the admin-extra.html. Include HTML code from
   ${solr.solr.home}/admin-links.html, letting people add links to their own
   support etc.
 * Similarly for the top section, allow including HTML code from
   ${solr.solr.home}/admin-navi.html, where you may add links to your master
   Solr or whatever

Suggestion for new tabs for each core:
 * Prototyping - pointing to the /browse Velocity GUI. Very useful!!
 * CoreAdmin - Buttons reload core, remove core, rename...

In the System tab for each core, it would be great to show some key
information:
 * # docs
 * Size of index (Mb)
 * Last add/delete timestamp
 * Optimized status (with a button to optimize now)
 * Button to reload core now (reloads config)

On the Query tab for each core:
 * Add a button "Delete docs matching this query"
   (with a JavaScript popup box "Are you sure?" :)
 * Add an input box for query type, setting the qt param
 * Add some links below the input boxes, expanding via JavaScript:
   - dismax params
   - spatial params
   - spellcheck params
   - faceting params

Should there also be a tab above all cores, with host-wide stuff?
 * Solr version
 * Host name, port
 * Solr HOME path
 * Zookeeper info and link
 * Core Admin (create new core)

Improve the admin-extra.html concept:
Today, if the file admin-extra.html exists it will be included near
top of current admin GUI. This can be useful, but in this new design, it
perhaps makes more sense to include the admin-extra.html contents in
a widget box on each core. Then each organization can customize and put
links to their internal issue trackers etc..

Include a Dev/Test/Prod indication:
It is common to have three different environments: one for development, one for
test and one for live production. It happens now and then that you perform the
wrong action on the wrong server :( so a visual clue as to which environment
you're in is very useful.
I propose a simple solid bar at the very top which is RED for prod, YELLOW
for test and GREEN for dev. Would it be possible to read a Java system property
-Dsolr.environment=dev and set the color of such a top bar based on that?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 2. mars 2011, at 21.47, Stefan Matheis wrote:

 Hi List,
 
 given the fact that my Java knowledge is more or less non-existent .. my idea was
 to rework the Solr Admin Interface.
 
 Compared to CouchDB's Futon or the MongoDB Admin-Utils .. not that fancy, but
 it was an idea a few weeks ago - and I would like to contribute something, a thing
 which has to be non-Java but not useless - hopefully ;)
 
 Actually it's completely work-in-progress .. but I'm interested in what you
 guys think. Right direction? Completely wrong, just drop it?
 
 http://files.mathe.is/solr-admin/01_dashboard.png
 http://files.mathe.is/solr-admin/02_query.png
 http://files.mathe.is/solr-admin/03_schema.png
 http://files.mathe.is/solr-admin/04_analysis.png
 http://files.mathe.is/solr-admin/05_plugins.png
 
 It's actually using one index.jsp to generate the basic frame, including cores
 and their navigation. Everything else is loaded via the existing SolrAdminHandler.
 
 Any questions, ideas, thoughts out there? Please let me know :)
 
 Regards
 Stefan

Re: perfect match in dismax search

2011-03-03 Thread Jan Høydahl

Hi,

I'm working on a Filter which enables boundary match using the syntax
title:"^hello I love you$"
which will make sure that the match is exact. See SOLR-1980 (no working patch
yet).

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 3. mars 2011, at 11.07, Markus Jelsma wrote:

 Use either the string fieldType or a field with very little analysis 
 (KeywordTokenizer + LowercaseFilter).
 
 How to obtain perfect match with dismax query??
 
 Example:
 
 I want to search for "hello i love you" with defType=dismax in the title field
 and get only results whose title is exactly "hello i love you", with
 all of these terms
 in this order.
 
 No fewer words and no others.
 How is this possible?
 
 I tried +(hello i love you), but a title such as "hello i
 love you mum" also matches, and I don't want that!
 
 Thanx

Date range query with mixed inclusive/exclusive

2011-03-03 Thread Tim Terlegård

Is there any chance that
https://issues.apache.org/jira/browse/LUCENE-996 will be backported to
the 3x branch? I see that it's fixed in trunk, but it will be a while
until it's in a release.

How do people generally search for documents from, let's say, the year 2009?
I thought it would be convenient to do something like:
publication:[2009-01-01T00:00:00Z TO 2010-01-01T00:00:00Z}

But there seems to be a bug that prevents this [...} kind of
search. So do people generally search like this?
publication:[2009-01-01T00:00:00Z TO 2009-12-31T23:59:59.999Z]

/Tim
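
A minimal Python sketch of the fully inclusive workaround described above: it builds a closed range covering one calendar year down to the last millisecond. The helper name is made up for this example; the field name publication is the one used in the message.

```python
def year_range_query(field, year):
    """Closed [start TO end] range covering one calendar year in UTC."""
    start = f"{year}-01-01T00:00:00Z"
    end = f"{year}-12-31T23:59:59.999Z"
    return f"{field}:[{start} TO {end}]"


print(year_range_query("publication", 2009))
# publication:[2009-01-01T00:00:00Z TO 2009-12-31T23:59:59.999Z]
```

This avoids the mixed [...} bracket form, at the cost of spelling out the last instant of the year by hand.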

Re: Solr TermsComponent: space in term

 Is there no way to achieve what the Op
 had to say ?
 

TermsComponent operates on indexed terms. One way to achieve multi-word 
suggestions is to use ShingleFilterFactory at index time.
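
To show what shingling means here, the following Python sketch produces single words plus two-word shingles from a token list. It only imitates the idea; the function name and the default size are assumptions for this example, and the real filter is configured in the schema.

```python
def shingles(tokens, max_shingle_size=2):
    """Return single words plus word n-grams of up to max_shingle_size words."""
    result = []
    for start in range(len(tokens)):
        for size in range(1, max_shingle_size + 1):
            if start + size <= len(tokens):
                result.append(" ".join(tokens[start:start + size]))
    return result


terms = shingles("solr terms component".split())
print(terms)
# ['solr', 'solr terms', 'terms', 'terms component', 'component']

# A prefix lookup over these indexed terms can now return a term with a space.
print([t for t in terms if t.startswith("terms c")])
# ['terms component']
```

Because the two-word shingles are indexed as terms in their own right, a prefix that contains a space can match them.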

Re: Solr TermsComponent: space in term

2011-03-03 Thread shrinath.m


iorixxx wrote:
 
 TermsComponent operates on indexed terms. One way to achieve multi-word
 suggestions is to use ShingleFilterFactory at index time.
 

Thank you @iorixxx.
Could you point me to some good docs on how to do this?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp1898889p2624429.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr TermsComponent: space in term

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory

On Thursday 03 March 2011 12:15:07 shrinath.m wrote:
 iorixxx wrote:
  TermsComponent operates on indexed terms. One way to achieve multi-word
  suggestions is to use ShingleFilterFactory at index time.
 
 Thank you @iorixxx.
 Could you point me where I can find a good docs on how to do this ?
 

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

adding a document using curl

2011-03-03 Thread Ken Foskey



I have read the various pages and used curl a lot, but I cannot figure out
the correct command line to add a document to the example Solr instance.


I have tried a few things; however, they seem to assume the file is on the same
server as Solr. In my case I am pushing the document from a Windows machine
to Solr for indexing.


Ta
Ken

Re: adding a document using curl

Here's a complete example
http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

On Thursday 03 March 2011 12:31:11 Ken Foskey wrote:
 I have read the various pages and used Curl a lot but i cannot figure out
 the correct command line to add a document to the example Solr instance.
 
 I have tried a few things however they seem to be for the file on the same
 server as solr,  in my case I am pushing the document from a windows
 machine to Solr for indexing.
 
 Ta
 Ken

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

Re: Solr TermsComponent: space in term

2011-03-03 Thread shrinath.m


Markus Jelsma-2 wrote:
 
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
 
well, thank you Markus, 

Now My schema has the following : 














if I run a query like this : 

http://localhost:8983/solr/select?rows=0&q=c&facet=true&facet.field=text&facet.mincount=1&facet.prefix=com

I get output saying : 


1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1



How do I restrict it to only those words present in the documents, and not
something like "compliance w"?


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-TermsComponent-space-in-term-tp1898889p2624547.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: adding a document using curl

2011-03-03 Thread Ken Foskey

On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:
 Here's a complete example
 http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

I should have been clearer: I mean a rich text document. XML I can make work,
and a script for that is in the example docs folder.

http://wiki.apache.org/solr/ExtractingRequestHandler

I also read the Solr 1.4 book and tried the samples in there, but could not
make them work.

Ta


 On Thursday 03 March 2011 12:31:11 Ken Foskey wrote:
  I have read the various pages and used Curl a lot but i cannot figure out
  the correct command line to add a document to the example Solr instance.
  
  I have tried a few things however they seem to be for the file on the same
  server as solr,  in my case I am pushing the document from a windows
  machine to Solr for indexing.
  
  Ta
  Ken

Re: adding a document using curl

2011-03-03 Thread pankaj bhatt

Hi All,
   Is there any custom open source Solr admin application like the one
Lucid Imagination provides in its distribution?
   I am trying to create one, but I think it would be
reinventing the wheel.

   Please point me to any open source
application that can be used.
   Waiting for your answer.

/ Pankaj Bhatt.

Custom SOLR ADMIN Application

2011-03-03 Thread pankaj bhatt

Hi All,
   Is there any custom open source Solr admin application like the one
Lucid Imagination provides in its distribution?
   I am trying to create one, but I think it would be
reinventing the wheel.

   Please point me to any open source
application that can be used.
   Waiting for your answer.

/ Pankaj Bhatt.

Re: AlternateDistributedMLT.patch not working

2011-03-03 Thread Edoardo Tosca

Hi all,
I am currently working on this AlternateDistributedMLT patch.
I've applied it manually to Solr 1.4 and solved some NullPointerException
issues.
It's now working properly.

But I'm not sure about its behaviour, so I'll ask you, list:

I saw that every MLT query for a doc that is in the result set runs only on
its own shard (the one where the doc is in the index).
This means that you can miss documents that are probably related to the doc but
are not retrieved because they belong to other shards.

Does it make sense?
Is it the expected behaviour?

If it is, I can submit the patch so that at least it works on Solr 1.4.0.

Thanks,

Edo


On Wed, Feb 23, 2011 at 6:53 PM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:

 Hi Isha,

 The patch is out of date.  You need to look at the patch and rejection and
 update your local copy of the code to match the logic from the patch, if
 it's
 still applicable to the version of Solr source code you have.

 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/



 - Original Message 
  From: Isha Garg isha.g...@orkash.com
  To: solr-user@lucene.apache.org
  Sent: Tue, February 22, 2011 2:13:23 AM
  Subject: AlternateDistributedMLT.patch not working
 
  Hello,
 
   I tried to use SOLR-788 with solr1.4 so that  distributed MLT works
 well .
 While working with this patch i got an error mesg  like
 
  1 out of 1 hunk FAILED -- saving rejects to file
 src/java/org/apache/solr/handler/component/MoreLikeThisComponent.java.rej
 
  Can  anybody help me out?
 
  Thanks!
  Isha Garg
 
 




-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com

Re: adding a document using curl

2011-03-03 Thread Gary Taylor


As an example, I run this in the same directory as the msword1.doc file:

curl "http://localhost:8983/solr/core0/update/extract?literal.docid=74&literal.type=5" -F "file=@msword1.doc"


The type literal is just part of my schema.

Gary.
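
As a sketch of how the URL in this example is put together, the Python snippet below prefixes each schema field with literal. and joins the parameters with &. The helper name is made up for this example; the host, core name and field values are the ones from the message.

```python
from urllib.parse import urlencode


def extract_url(base_url, literals):
    """Build an /update/extract URL with one literal.<field> parameter per entry."""
    params = [("literal." + name, value) for name, value in literals.items()]
    return base_url + "?" + urlencode(params)


url = extract_url(
    "http://localhost:8983/solr/core0/update/extract",
    {"docid": 74, "type": 5},
)
print(url)
# http://localhost:8983/solr/core0/update/extract?literal.docid=74&literal.type=5
```

When the URL is passed to curl, it needs to be quoted on the command line, because most shells (including the Windows command prompt) treat an unquoted & as a command separator.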


On 03/03/2011 11:45, Ken Foskey wrote:

On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:

Here's a complete example
http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

I should have been clearer.   A rich text document,  XML I can make work
and a script is in the example docs folder

http://wiki.apache.org/solr/ExtractingRequestHandler

I also read the solr 1.4 book and tried samples in there,   could not
make them work.

Ta

Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread mrw

Picture the URI field above the response field, only half-screen. This
facilitates breaking the query apart on different lines in order to debug
it.

When you have a lot of shards, fq clauses, etc., you end up with a very long
URI that is difficult to get your head around and manipulate. We take
queries from the logs, split them around parameters, take the shards out,
put the shards back in, take the OLS labels out, put them back in, etc.
With long, complex queries, it's essential to have a large work space to
play in. :)
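
The workflow described here (split a logged query around its parameters, drop the shards, put everything back together) can be sketched in a few lines of Python. The sample URL and the helper names are assumptions made for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def split_query(url):
    """Return the URL parts and the decoded (name, value) parameter pairs."""
    parts = urlsplit(url)
    return parts, parse_qsl(parts.query, keep_blank_values=True)


def join_query(parts, params):
    """Re-encode the parameter pairs and rebuild the URL."""
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(params), parts.fragment)
    )


# Hypothetical query taken from a log file.
logged = (
    "http://localhost:8983/solr/select?q=title%3Asolr&fq=group%3Asports"
    "&shards=host1%3A8983%2Fsolr%2Chost2%3A8983%2Fsolr&rows=10"
)

parts, params = split_query(logged)
for name, value in params:
    print(f"{name} = {value}")  # one decoded parameter per line

without_shards = [(n, v) for n, v in params if n != "shards"]
print(join_query(parts, without_shards))
# http://localhost:8983/solr/select?q=title%3Asolr&fq=group%3Asports&rows=10
```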

Stefan Matheis wrote:

mrw,

you mean a field like here
(http://files.mathe.is/solr-admin/02_query.png) on the right side,
between meta-navigation and plain solr-xml response?

actually it's just to display the computed url, but if so .. we could
use a larger field for that, of course :)

Regards
Stefan

Am 02.03.2011 22:31, schrieb mrw:

Looks nice.

Might be also worth it to create a page with large query field for
pasting
in complete URL-encoded queries that cross cores, etc. I did that at
work
(via ASP.net) so we could paste in queries from logs and debug them. We
tend to use that quite a bit.

Cheers

Stefan Matheis wrote:

Hi List,

given the fact that my Java knowledge is more or less non-existent .. my
idea was to rework the Solr Admin Interface.

Compared to CouchDB's Futon or the MongoDB Admin-Utils .. not that fancy,
but it was an idea a few weeks ago - and I would like to contribute
something, a thing which has to be non-Java but not useless - hopefully
;)

Actually it's completely work-in-progress .. but I'm interested in what
you guys think. Right direction? Completely wrong, just drop it?

http://files.mathe.is/solr-admin/01_dashboard.png
http://files.mathe.is/solr-admin/02_query.png
http://files.mathe.is/solr-admin/03_schema.png
http://files.mathe.is/solr-admin/04_analysis.png
http://files.mathe.is/solr-admin/05_plugins.png

It's actually using one index.jsp to generate the basic frame, including
cores and their navigation. Everything else is loaded via the existing
SolrAdminHandler.

Any questions, ideas, thoughts out there? Please let me know :)

Regards
Stefan


--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Admin-Interface-reworked-Go-on-Go-away-tp2620365p2624956.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Understanding multi-field queries with q and fq

2011-03-03 Thread mrw

Yes, we're investigating dismax (with the qf param), but we're not sure it
supports our syntax needs.  The users want to put AND/OR/NOT in their
queries, and we don't want to write a lot of code converting those queries
into dismax (+/-/mm) format.  So, until 3.1 (edismax) ships, we're also
trying to get boolean queries to work across multiple fields with the
standard query handler.

I've seen quite a few unanswered or partially-answered posts on this list on
getting boolean syntax right.  I can tell it's a thorny issue.
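
One way to apply a user's boolean query to several fields with the standard query syntax is to wrap it in parentheses once per field and OR the copies together. The Python sketch below does just that; the function name is made up for this example, the field names are the ones used earlier in this thread, and it assumes the user's query has balanced parentheses.

```python
def expand_across_fields(user_query, fields):
    """Repeat the user's query once per field, using field:(...) grouping."""
    return " OR ".join(f"{field}:({user_query})" for field in fields)


print(expand_across_fields(
    "zeppelin AND (dog OR merle)",
    ["All_Metadata", "Text_English", "Text_German"],
))
# All_Metadata:(zeppelin AND (dog OR merle)) OR Text_English:(zeppelin AND (dog OR merle)) OR Text_German:(zeppelin AND (dog OR merle))
```

The field:( ... ) grouping makes every term inside the parentheses apply to that field, so the user's AND/OR/NOT operators survive unchanged.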


Robert Sandiford wrote:
 
 Have you looked at the 'qf' parameter?
 
 Bob Sandiford | Lead Software Engineer | SirsiDynix
 P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com
 www.sirsidynix.com 
 _
 http://www.cosugi.org/ 
 
 
 
 
 -Original Message-
 From: mrw [mailto:mikerobertsw...@gmail.com]
 Sent: Wednesday, March 02, 2011 2:28 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Understanding multi-field queries with q and fq
 
 Anyone understand how to do boolean logic across multiple fields?
 
 Dismax is nice for searching multiple fields, but doesn't necessarily
 support our syntax requirements. eDismax appears to be not available
 until
 Solr 3.1.
 
 In the meantime, it looks like we need to support applying the user's
 query
 to multiple fields, so if the user enters led zeppelin merle we need
 to be
 able to do the logical equivalent of
 
 fq=field1:led zeppelin merle OR field2:led zeppelin merle
 
 
 Any ideas?  :)
 
 
 
 mrw wrote:
 
  After searching this list, Google, and looking through the Pugh book,
 I am
  a little confused about the right way to structure a query.
 
  The Packt book uses the example of the MusicBrainz DB full of song
  metadata.  What if they also had the song lyrics in English and
 German as
  files on disk, and wanted to index them along with the metadata, so
 that
  each document would basically have song title, artist, publisher,
 date,
  ..., All_Metadata (copy field of all metadata fields), Text_English,
 and
  Text_German fields?
 
  There can only be one default field, correct?  So if we want to
 search for
  all songs containing (zeppelin AND (dog OR merle)) do we
 
  repeat the entire query text for all three major fields in the 'q'
 clause
  (assuming we don't want to use the cache):
 
  q=(+All_Metadata:zeppelin AND (dog OR merle)+Text_English:zeppelin
 AND
  (dog OR merle)+Text_German:(zeppelin AND (dog OR merle))
 
  or repeat the entire query text for all three major fields in the
 'fq'
  clause (assuming we want to use the cache):
 
  q=*:*&fq=(+All_Metadata:zeppelin AND (dog OR
 merle)+Text_English:zeppelin
  AND (dog OR merle)+Text_German:zeppelin AND (dog OR merle))
 
  ?
 
  Thanks!
 
 
 
 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/Understanding-multi-field-queries-with-q-and-fq-tp2528866p2625068.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr TermsComponent: space in term



You need to remove EdgeNGramFilterFactory from your analyzer chain.
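
To see where a term like "compliance w" comes from, here is a small Python imitation of edge n-grams applied to a two-word shingle. The size limits and the input shingle "compliance with" are assumptions for this illustration; the actual limits come from the schema.

```python
def edge_ngrams(term, min_size=1, max_size=25):
    """Return the leading substrings of term, from min_size up to max_size characters."""
    upper = min(max_size, len(term))
    return [term[:n] for n in range(min_size, upper + 1)]


grams = edge_ngrams("compliance with")
print(grams[-4:])
# ['compliance w', 'compliance wi', 'compliance wit', 'compliance with']
print("compliance w" in grams)  # True
```

Every prefix is indexed as a term of its own, so faceting on that field lists the partial words as well.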



--- On Thu, 3/3/11, shrinath.m shrinat...@webyog.com wrote:

 From: shrinath.m shrinat...@webyog.com
 Subject: Re: Solr TermsComponent: space in term
 To: solr-user@lucene.apache.org
 Date: Thursday, March 3, 2011, 1:41 PM
 
 Markus Jelsma-2 wrote:
  
  http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ShingleFilterFactory
  
 well, thank you Markus, 
 
 Now My schema has the following : 
 
 
             
                 
                 
                 
                 
         
                 
                 
                 
             
         
 
 if I run a query like this : 
 
 http://localhost:8983/solr/select?rows=0&q=c&facet=true&facet.field=text&facet.mincount=1&facet.prefix=com
 
 I get output saying : 
 
 
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 
 
 
 how do I restrict it to only those words present in the
 documents and not
 something like compliance w ?
 
 

Re: adding a document using curl

On Thu, Mar 3, 2011 at 5:15 PM, Ken Foskey kfos...@tpg.com.au wrote:
 On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:
 Here's a complete example
 http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

 I should have been clearer.   A rich text document,  XML I can make work
 and a script is in the example docs folder

 http://wiki.apache.org/solr/ExtractingRequestHandler

 I also read the solr 1.4 book and tried samples in there,   could not
 make them work.
[...]

Please provide details on what exactly is not working for you, and the
corresponding error message from the Solr logs. E.g., something like
"I tried posting ABC document to Solr, using XYZ commands", and
include the part from the Solr logs relating to the exception that you
get. After that, further details might be needed, but without the above
it is nigh impossible to guess at what you are trying.

Regards,
Gora

Re: adding a document using curl

On Thu, Mar 3, 2011 at 5:31 PM, pankaj bhatt panbh...@gmail.com wrote:
 Hi All,
       is there any Custom open source SOLR ADMIN application like what
 lucid imagination provides in its distribution.
       I am trying to create thing, however thinking it would be a
 reinventing of wheel.

       Request you to please redirect me, if there is any open source
 application that can be used.
       Waiting for your answer.
[...]

Please do not hijack an existing thread, but start a new one if
you want to discuss a new topic. On Hoss' behalf :-)
http://people.apache.org/~hossman/#threadhijack

Regards,
Gora

Re: adding a document using curl

2011-03-03 Thread Jayendra Patil

If you are using the ExtractingRequestHandler, you can also try using
the stream.file or stream.url.

e.g. curl "http://localhost:8080/solr/core0/update/extract?stream.file=C:/777045.zip&literal.id=777045&literal.title=Test&commit=true"

More detailed explanation at
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika

The literal. prefix sets ordinary fields to the given values, and the content
extracted from the document is stored in the text field by default.

Regards,
Jayendra

On Thu, Mar 3, 2011 at 7:16 AM, Gary Taylor g...@inovem.com wrote:
 As an example, I run this in the same directory as the msword1.doc file:

 curl "http://localhost:8983/solr/core0/update/extract?literal.docid=74&literal.type=5" -F "file=@msword1.doc"

 The type literal is just part of my schema.

 Gary.


 On 03/03/2011 11:45, Ken Foskey wrote:

 On Thu, 2011-03-03 at 12:36 +0100, Markus Jelsma wrote:

 Here's a complete example

 http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_parameters_as_part_of_the_URL

 I should have been clearer.   A rich text document,  XML I can make work
 and a script is in the example docs folder

 http://wiki.apache.org/solr/ExtractingRequestHandler

 I also read the solr 1.4 book and tried samples in there,   could not
 make them work.

 Ta

error in log INFO org.apache.solr.core.SolrCore - webapp=/solr path=/admin/ping params={} status=0 QTime=1

2011-03-03 Thread Mike Franon

I am using solr under jboss, so this might be more of a jboss config
issue, not really sure.  But my logs keep getting spammed, because
solr sends it as ERROR [STDERR] INFO org.apache.solr.core.SolrCore -
webapp=/solr path=/admin/ping params={} status=0 QTime=1

Has anyone seen this and found a workaround to not send this as an Error?

Thanks,
Mike

Content-Type of XMLResponseWriter / QueryResponseWriter

2011-03-03 Thread Bernd Fehling

Dear list,

is there any deeper logic behind the fact that XMLResponseWriter
sends CONTENT_TYPE_XML_UTF8="application/xml; charset=UTF-8"?

I would expect (as most browsers do) to receive text/xml for XML output,
not application/xml.

Or do you want the browser to call an XML editor with the result?

Best regards, Bernd

deletedPKQuery does not perform with compound PK

2011-03-03 Thread Jérôme Droz


Hello,

I'm using a DIH to import documents from a database. Documents in the 
index represent a relationship between two entities, units and 
dealpoints (unit has dealpoint); thus document keys in the index refer 
to a compound SQL key. Full import works fine. In order to optimize the 
import process, I configured both the database and DIH configuration 
file for delta-import.


I added 3 more tables, updated by triggers: a table tracking 
modification time of units, another one tracking modification time of 
dealpoints, and the last one used to track deleted units having a 
dealpoint.


The uniqueKey field of the schema is defined as follows:

<field name="id" type="string" indexed="true" stored="true"
       required="true" multiValued="false" />

...
<uniqueKey>id</uniqueKey>

Keys are generated by concatenating the unit id and the dealpoint id, 
separated by '-', in the SQL query.


Below is a sample of the data-config.xml I'm using (the original one is 
quite huge and may be confusing):


<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://somehost:3306/somedatabase"
              user="user"
              password="**" />
  <document name="unitdealpoints">
    <entity name="unitdealpoint" pk="unit_id,dealpoint_id"
            query="select concat_ws('-', cast(u.unit_id as char), cast(dp.deal_point_id as char)) as id, ...
                   from unit u, deal_point dp, ... where ..."
            deltaQuery="select us.unit_id as unit_id, dps.deal_point_id as dealpoint_id
                        from unit_state us, deal_point_state dps
                        where us.unit_state_last_mod &gt; '${dataimporter.last_index_time}'
                           or dps.deal_point_state_last_mod &gt; '${dataimporter.last_index_time}'"
            deltaImportQuery="select concat_ws('-', cast(u.unit_id as char), cast(dp.deal_point_id as char)) as id, ...
                              from unit u, deal_point dp, ...
                              where (u.unit_id = '${dataimporter.delta.unit_id}'
                                 or dp.deal_point_id = '${dataimporter.delta.dealpoint_id}') and ..."
            deletedPKQuery="select id from unit_deal_point_delete">
      ...
    </entity>
  </document>
</dataConfig>

I specifically choose to track deleted entities in a dedicated 
(unit_deal_point_delete) table in order to prevent the known (and 
apparently unsolved) bugs described here:

https://issues.apache.org/jira/browse/SOLR-1229?focusedCommentId=12722427&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12722427

The id field in the unit_deal_point_delete table has the exact same 
representation as the document keys. Below is an example of a trigger:


create trigger unit_delete_before before delete on unit
for each row
begin
    insert ignore into unit_deal_point_delete (id)
    select concat_ws('-', cast(old.unit_id as char), cast(dpu.deal_point_id as char))
    from deal_point_unit dpu
    where dpu.unit_id = old.unit_id;
end;

The delta and delta-import queries work fine, but the deletedPKQuery seems
to always return 0 rows, although the unit_deal_point_delete table is
obviously not empty. No errors are written to the logs, only:


Mar 3, 2011 11:23:49 AM org.apache.solr.handler.dataimport.DocBuilder collectDelta
INFO: Completed DeletedRowKey for Entity: unitdealpoints rows obtained : 0

I have tested it with versions 1.4.0 and 1.4.1 and the result is the same:
documents are not deleted.


What is the problem? Am I missing something?

Kind regards
--
Jerome Droz

Why is SolrDispatchFilter using 90% of the Time?

2011-03-03 Thread Stijn Vanhoorelbeke

Hi,

I'm working with a recent NightlyBuild of Solr and I'm doing some serious
ZooKeeper testing.
I've NewRelic monitoring enabled on my solr machines.

When I look at the distribution of the Response-time I notice
'SolrDispatchFilter.doFilter()' is taking up 90% of the time.
The other 10% is used by SolrSeacher and the QueryComponent.

+ Can anyone explain to me why SolrDispatchFilter is consuming so much time?
++ Can I do something to lower this number?
 (After all, SolrDispatchFilter must dispatch each request to the standard
searcher.)

Stijn Vanhoorelbeke

uniqueKey merge documents on commit

2011-03-03 Thread Tim Gilbert

Hi,

 

I have a unique key within my index, but rather than the default
behaviour of overwriting, I am wondering if there is a way to merge
the two different documents on commit of the second document.  I have a
test case which explains what I'd like to happen:

 

@Test
public void testMerge() throws SolrServerException, IOException
{
    // First document: sets value1_i for the unique key "testid".
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("secid", "testid");
    doc1.addField("value1_i", 1);

    SolrAllSec.GetSolrServer().add(doc1);
    SolrAllSec.GetSolrServer().commit();

    // Second document: same unique key, but a different field.
    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("secid", "testid");
    doc2.addField("value2_i", 2);

    SolrAllSec.GetSolrServer().add(doc2);
    SolrAllSec.GetSolrServer().commit();

    SolrQuery solrQuery = new SolrQuery();
    solrQuery = solrQuery.setQuery("secid:testid");
    QueryResponse response =
        SolrAllSec.GetSolrServer().query(solrQuery, METHOD.GET);

    // Desired outcome: one merged document carrying both fields.
    List<SolrDocument> result = response.getResults();
    Assert.isTrue(result.size() == 1);
    Assert.isTrue(result.get(0).getFieldValue("value1_i") != null);
    Assert.isTrue(result.get(0).getFieldValue("value2_i") != null);
}

 

Other than reading doc1 and adding the fields from doc2 and
recommitting, is there another way?

 

Thanks in advance,

 

Tim
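
For reference, the read-and-re-add approach mentioned in the question can be sketched on the client side as follows. This is plain Python with documents modelled as dictionaries, not SolrJ; the function name is made up for this example, and fetching the existing document and re-adding the merged one are left out.

```python
def merge_documents(existing, incoming, unique_key="secid"):
    """Combine the fields of two documents that share the same unique key.

    Fields in incoming win when both documents define the same field.
    """
    if existing[unique_key] != incoming[unique_key]:
        raise ValueError("documents have different unique keys")
    merged = dict(existing)
    merged.update(incoming)
    return merged


doc1 = {"secid": "testid", "value1_i": 1}
doc2 = {"secid": "testid", "value2_i": 2}
print(merge_documents(doc1, doc2))
# {'secid': 'testid', 'value1_i': 1, 'value2_i': 2}
```

The merged dictionary is what would be sent back as a single document, so the overwrite on the unique key then replaces the old version with the combined one. Note that only stored fields can be read back from the index for such a merge.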

Re: Content-Type of XMLResponseWriter / QueryResponseWriter

2011-03-03 Thread Walter Underwood

Never use text/xml, that overrides any encoding declaration inside the XML file.

http://ln.hixie.ch/?start=1037398795count=1
http://www.grauw.nl/blog/entry/489

wunder
==
Lead Engineer, MarkLogic
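
The rule behind this advice can be sketched as a small decision function. This reflects my reading of RFC 3023 (text/xml without a charset parameter is treated as us-ascii, while application/xml without one defers to the XML declaration); treat it as an illustration, not a complete implementation of the RFC. The function name is made up for this example.

```python
def effective_charset(content_type, xml_declared_encoding):
    """Pick the charset a consumer should use for an XML response."""
    media_type, _, rest = content_type.partition(";")
    media_type = media_type.strip().lower()

    # An explicit charset parameter on the header always wins.
    for param in rest.split(";"):
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"')

    # text/xml without charset: the XML declaration is ignored.
    if media_type == "text/xml":
        return "us-ascii"

    # application/xml without charset: the XML declaration decides.
    return xml_declared_encoding


print(effective_charset("text/xml", "UTF-8"))                        # us-ascii
print(effective_charset("application/xml", "UTF-8"))                 # UTF-8
print(effective_charset("application/xml; charset=UTF-8", "UTF-8"))  # UTF-8
```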

On Mar 3, 2011, at 7:30 AM, Bernd Fehling wrote:

 Dear list,
 
 is there any deeper logic behind the fact that XMLResponseWriter
 sends CONTENT_TYPE_XML_UTF8="application/xml; charset=UTF-8"?
 
 I would expect (as most browsers do) to receive text/xml for XML output,
 not application/xml.
 
 Or do you want the browser to call an XML editor with the result?
 
 Best regards, Bernd

Omit hour-min-sec in search?

2011-03-03 Thread bbarani

Hi,

Is there a way to omit hour-min-sec in SOLR date field during search?

I have indexed a field using TrieDateField, and it seems to use UTC
format. The dates get stored as below:

lastupdateddate: 2008-02-26T20:40:30.94Z

I want to search on just YYYY-MM-DD and omit T20:40:30.94Z. I'm not
sure if it's feasible; I just want to check whether it's possible.

Also, most of the data in our source doesn't have time information, so we
are very much interested in storing just the date without the time. Even if
it's stored with some default timestamp, we want to search using just the
date, without the timestamp.

Thanks,
Barani



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Omit-hour-min-sec-in-search-tp2625840p2625840.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread Stefan Matheis

Hey Jan,

On Thu, Mar 3, 2011 at 11:37 AM, Jan Høydahl jan@cominvent.com wrote:
 This alone is worthy including, but I also (of course) have some 
 comments/ideas: [...]

Really nice! I'll try to make a list of open todos / missing items and
attach it to the JIRA ticket. For the dismax and spatial query params in
particular, I would need some information (I have not used them until now),
but I think these are smaller problems compared to the complete task :)

Regards
Stefan

Re: SolrJ Tutorial

2011-03-03 Thread Bing Li

Dear Lance,

Could you tell me where I can find the unit test code?

I appreciate your help so much!

Best regards,
LB

On Sat, Jan 22, 2011 at 3:58 PM, Lance Norskog goks...@gmail.com wrote:

 The unit tests are simple and show the steps.

 Lance

 On Fri, Jan 21, 2011 at 10:41 PM, Bing Li lbl...@gmail.com wrote:
  Hi, all,
 
  In the past, I always used SolrNet to interact with Solr. It works great.
  Now, I need to use SolrJ. I think it should be easier to do that than
  SolrNet since Solr and SolrJ should be homogeneous. But I cannot find a
  tutorial that is easy to follow. No tutorials explain SolrJ programming
  step by step. No complete samples are found. Could anybody offer me some
  online resources to learn SolrJ?
 
  I also noticed Solr Cell and SolrJ POJO. Do you have detailed resources
  on them?
 
  Thanks so much!
  LB
 



 --
 Lance Norskog
 goks...@gmail.com

Re: Omit hour-min-sec in search?

2011-03-03 Thread Shane Perry

Not sure if there is a means of doing explicitly what you ask, but you
could do a date range:

+mydate:[YYYY-MM-DDT00:00:00Z TO YYYY-MM-DDT23:59:59Z]
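
Solr date fields expect full ISO 8601 UTC timestamps, so a whole-day range has to run from 00:00:00Z up to the last instant of that day. A minimal Java sketch of building such a range on the client side follows; the class and method names are made up for illustration, and the field name is the one from the question.

```java
import java.time.LocalDate;

// Hypothetical helper: builds a range query covering one whole UTC day.
class DayRange {
    static String dayRange(String field, LocalDate day) {
        // LocalDate.toString() yields YYYY-MM-DD, which is the date part Solr expects.
        return field + ":[" + day + "T00:00:00Z TO " + day + "T23:59:59.999Z]";
    }

    public static void main(String[] args) {
        // prints lastupdateddate:[2008-02-26T00:00:00Z TO 2008-02-26T23:59:59.999Z]
        System.out.println(dayRange("lastupdateddate", LocalDate.of(2008, 2, 26)));
    }
}
```

The resulting string can be used as q or, since the same day is likely to be queried repeatedly, as an fq.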

On Thu, Mar 3, 2011 at 9:14 AM, bbarani bbar...@gmail.com wrote:
 Hi,

 Is there a way to omit hour-min-sec in SOLR date field during search?

 I have indexed a field using TrieDateField, and it seems to use UTC
 format. The dates get stored as below:

 lastupdateddate: 2008-02-26T20:40:30.94Z

 I want to search on just YYYY-MM-DD and omit T20:40:30.94Z. I'm not
 sure if it's feasible; I just want to check whether it's possible.

 Also, most of the data in our source doesn't have time information, so we
 are very much interested in storing just the date without the time. Even if
 it's stored with some default timestamp, we want to search using just the
 date, without the timestamp.

 Thanks,
 Barani




FilterQuery OR statement

2011-03-03 Thread Tanner Postert

Trying to figure out how I can run something similar to this for the fq
parameter:

Field1 in ( 1, 2, 3, 4 )
AND
Field2 in ( 4, 5, 6, 7 )

I found some examples on the net that looked like this:
fq=+field1:(1 2 3 4) +field2(4 5 6 7)
but that yields no results.

Re: Dismax, q, q.alt, and defaultSearchField?

2011-03-03 Thread mrw

Thanks, Jan.

It looks like what we need to do is use both q and q.alt, such that q.alt is
always *:* and q is either empty for filter-only queries or has the user
text. That seems to work.
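
A rough Java sketch of that parameter logic, for anyone who wants to see it spelled out. The helper class is hypothetical, the field names and the mm/tie values are the ones from the original question, and leaving q out entirely (rather than sending it empty) is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: always sends q.alt=*:* and adds q only when
// the user actually typed something.
class DismaxParams {
    static Map<String, String> build(String userText) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defType", "dismax");
        params.put("qf", "field1 field2");
        params.put("mm", "1");
        params.put("tie", "0.1");
        // Fallback used by dismax when q is missing: match all documents.
        params.put("q.alt", "*:*");
        if (userText != null && !userText.trim().isEmpty()) {
            // User queries go through q, so they resolve against qf.
            params.put("q", userText.trim());
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(build("banana")); // user query
        System.out.println(build(""));       // filter-only / facet-only query
    }
}
```

Because user text never goes through q.alt, it is always resolved against qf and no longer depends on each bank's defaultSearchField.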

Jan Høydahl / Cominvent wrote:

Hi,

Try
q.alt={!dismax}banana

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 2 March 2011, at 23:06, mrw wrote:

We have two banks of Solr nodes with identical schemas. The data I'm
searching for is in both banks.

One has defaultSearchField set to field1, the other has defaultSearchField
set to field2.

We need to support both user queries and facet queries that have no user
content. For the latter, it appears I need to use q.alt=*:*, so I am
investigating also using q.alt for user content (e.g., q.alt=banana).

I run the following query:

q.alt=banana
defType=dismax
mm=1
tie=0.1
qf=field1+field2

On bank one, I get the expected results, but on bank two, I get 0 results.

I noticed (via debugQuery=true) that when I use q.alt, it resolves using
the defaultSearchField (e.g., field1:banana), not the value of the qf param.
Therefore, I get different results.

If I switched to using q for user queries and q.alt for facet queries, I
would still get different results, because q would resolve against the
fields in the qf param, and q.alt would resolve against the default search
field.

Is there a way to override this behavior in order to get consistent results?

Thanks!

--
View this message in context:
http://lucene.472066.n3.nabble.com/Dismax-q-q-alt-and-defaultSearchField-tp2621061p2621061.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
View this message in context:
http://lucene.472066.n3.nabble.com/Dismax-q-q-alt-and-defaultSearchField-tp2621061p2627134.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: FilterQuery OR statement

 Trying to figure out how I can run
 something similar to this for the fq
 parameter
 
 Field1 in ( 1, 2, 3 4 )
 AND
 Field2 in ( 4, 5, 6, 7 )
 
 I found some examples on the net that looked like this:
 fq=+field1:(1 2 3
 4) +field2(4 5 6 7) but that yields no results.

Maybe your default operator is set to AND in schema.xml?
If so, try using +field2:(4 OR 5 OR 6 OR 7)

Re: uniqueKey merge documents on commit

2011-03-03 Thread Jonathan Rochkind


Nope, there is not.
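
That leaves the client-side route mentioned in the question: read the existing document, overlay the new fields, and re-add the result under the same key. A minimal sketch of just the merge step is below. Plain maps stand in for SolrDocument and SolrInputDocument so the example runs without SolrJ, the class name is hypothetical, and it assumes every field is stored (a field that is only indexed cannot be read back and would be lost).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: overlays the new fields on the stored ones.
class MergeFields {
    static Map<String, Object> merge(Map<String, Object> stored, Map<String, Object> update) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        // New values win when both documents carry the same field.
        merged.putAll(update);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> doc1 = new LinkedHashMap<>();
        doc1.put("secid", "testid");
        doc1.put("value1_i", 1);

        Map<String, Object> doc2 = new LinkedHashMap<>();
        doc2.put("secid", "testid");
        doc2.put("value2_i", 2);

        // prints {secid=testid, value1_i=1, value2_i=2}
        System.out.println(merge(doc1, doc2));
    }
}
```

In a real client the merged fields would be copied into a new SolrInputDocument and added again, which overwrites the old document because the uniqueKey is the same.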

On 3/3/2011 10:55 AM, Tim Gilbert wrote:

Hi,

I have a unique key within my index, but rather than the default
behaviour of overwriting, I am wondering if there is a method to merge
the two different documents on commit of the second document.  I have a
test case which explains what I'd like to happen:

@Test
public void testMerge() throws SolrServerException, IOException
{
    SolrInputDocument doc1 = new SolrInputDocument();
    doc1.addField("secid", "testid");
    doc1.addField("value1_i", 1);

    SolrAllSec.GetSolrServer().add(doc1);
    SolrAllSec.GetSolrServer().commit();

    SolrInputDocument doc2 = new SolrInputDocument();
    doc2.addField("secid", "testid");
    doc2.addField("value2_i", 2);

    SolrAllSec.GetSolrServer().add(doc2);
    SolrAllSec.GetSolrServer().commit();

    SolrQuery solrQuery = new SolrQuery();
    solrQuery = solrQuery.setQuery("secid:testid");
    QueryResponse response =
        SolrAllSec.GetSolrServer().query(solrQuery, METHOD.GET);

    List<SolrDocument> result = response.getResults();
    Assert.isTrue(result.size() == 1);
    Assert.isTrue(result.get(0).containsKey("value1_i"));
    Assert.isTrue(result.get(0).containsKey("value2_i"));
}

Other than reading doc1 and adding the fields from doc2 and
recommitting, is there another way?

Thanks in advance,

Tim

Re: FilterQuery OR statement

--- On Thu, 3/3/11, Ahmet Arslan iori...@yahoo.com wrote:

 From: Ahmet Arslan iori...@yahoo.com
 Subject: Re: FilterQuery OR statement
 To: solr-user@lucene.apache.org
 Date: Thursday, March 3, 2011, 8:05 PM
  Trying to figure out how I can
 run
  something similar to this for the fq
  parameter

  Field1 in ( 1, 2, 3 4 )
  AND
  Field2 in ( 4, 5, 6, 7 )

  I found some examples on the net that looked like
 this:
  fq=+field1:(1 2 3
  4) +field2(4 5 6 7) but that yields no results.

 May be your default operator is set to AND in schema.xml?
 If yes, try using +field2(4 OR 5 OR 6 OR 7) 

Actually you can use local params for that.
http://wiki.apache.org/solr/LocalParams

fq={!q.op=OR df=field1}1 2 3 4&fq={!q.op=OR df=field2}4 5 6 7

Re: FilterQuery OR statement

2011-03-03 Thread Tanner Postert

That worked. I thought I had tried it before; not sure why it didn't work then.

Also, is there a way to query without a q parameter?

I'm just trying to pull back all of the field results where field1:(1 OR 2
OR 3) etc., so I figured I'd use the fq param for caching purposes because
those queries will likely be run a lot, but if I leave the q parameter off I
get a null pointer error.

On Thu, Mar 3, 2011 at 11:05 AM, Ahmet Arslan iori...@yahoo.com wrote:

  Trying to figure out how I can run
  something similar to this for the fq
  parameter
 
  Field1 in ( 1, 2, 3 4 )
  AND
  Field2 in ( 4, 5, 6, 7 )
 
  I found some examples on the net that looked like this:
  fq=+field1:(1 2 3
  4) +field2(4 5 6 7) but that yields no results.

 May be your default operator is set to AND in schema.xml?
 If yes, try using +field2(4 OR 5 OR 6 OR 7)

Location of Main Class in Solr?

2011-03-03 Thread Anurag

I searched the SolrIndexSearcher.java file, but there is no main class.  I
wanted to know where this class resides. Can I call this main class (if it
exists) using command-line options in a terminal, rather than through the war
file?

-
Kumar Anurag

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Location-of-Main-Class-in-Solr-tp2627576p2627576.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Admin Interface, reworked - Go on? Go away?

2011-03-03 Thread Stefan Matheis


On 02.03.2011 at 23:48, Robert Muir wrote:

On Wed, Mar 2, 2011 at 5:34 PM, Stefan Matheis
matheis.ste...@googlemail.com  wrote:

Robert,

even in this WIP-State? if so .. i'll try one tomorrow evening after work



It's totally up to you, sometimes it can be useful to upload a partial
or WIP solution to an issue: as Hoss mentioned, it's a good way to get
feedback and additional ideas while you work.


There you go :) https://issues.apache.org/jira/browse/SOLR-2399

mixing version of solr

2011-03-03 Thread Ofer Fort

Hey all,
I have a master and a slave using the same index folder; the master only
writes, and the slave only reads.
Is it possible to use different versions of Solr for those two servers?
Let's say I want to gain from the improved search speed of Solr 4.0, but since
it's my production system, I am not willing to index with it because it's not a
stable release.
Since the slave only reads, if it crashes I'll just restart it.

Can I index using Solr 1.4.1 and read the same index with Solr 4.0?

thanks

Re: mixing version of solr

2011-03-03 Thread Frederik Kraus

No, that won't work as the index format has changed.
On Thursday, 3 March 2011 at 20:03, Ofer Fort wrote:
 Hey all,
 I have a master slave using the same index folder, the master only writes,
 and the slave only reads.
 Is it possible to use different versions of solr for those two servers?
 Let's say i want to gain from the improved search speed of solr4.0 but since
 it's my production system, am not willing to index using it since it's not a
 stable release.
 Since the slave only reads, if it will crash i'll just restart it.
 
 Can i index using solr 1.4.1 and read the same index with solr 4.0?
 
 thanks

Limiting on dates in Solr

2011-03-03 Thread Steve Lewis

I am treating Solr as a NoSQL db that has great search capabilities. I am 
querying on a few fields:

1. text (default)
2. type (my own string field)
3. calibration (my own date field)

I'd like to limit the results to only show the calibration using this query:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]

This mostly works, but a couple of different dates (March 5) seep into the
March 3rd results. Is there any way to exclude the other dates, or at least
have them return a lower ranking in the search? I've also tried:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]  AND NOT ( 
calibration:[* TO 2011-03-03T00:00:00.000Z] OR 
calibration:[2011-03-03T59:59:99.999Z TO *])

Which I found suggested on the stackoverflow web site. I've googled a good bit 
and nothing seems to be jumping out at me. No one else appears to be trying to 
do something similar, so I may just have unrealistic expectations of what a 
search engine will do.

Thanks in advance!
Steve

Re: FilterQuery OR statement

2011-03-03 Thread Jonathan Rochkind

You might also consider splitting your two separate AND clauses into
two separate fq's:


fq=field1:(1 OR 2 OR 3 OR 4)
fq=field2:(4 OR 5 OR 6 OR 7)

That will cache the two separate clauses separately in the filter cache,
which is probably preferable in general, without knowing more about your
usage characteristics.
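
If the value lists come from application code, building one such clause per field is straightforward. Here is a small Java sketch; the class and method names are invented for illustration, and it assumes the values are plain integers that need no escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: renders "field IN (a, b, c)" as an explicit OR clause.
class OrFilter {
    static String orFilter(String field, List<Integer> values) {
        return values.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(" OR ", field + ":(", ")"));
    }

    public static void main(String[] args) {
        // One fq per field, so each clause is cached on its own.
        System.out.println("fq=" + orFilter("field1", Arrays.asList(1, 2, 3, 4)));
        System.out.println("fq=" + orFilter("field2", Arrays.asList(4, 5, 6, 7)));
    }
}
```

Spelling out OR in every clause keeps the filters independent of whatever default operator schema.xml happens to declare.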


ALSO, instead of either supplying the OR explicitly as above, OR 
changing the default operator in schema.xml for everything, I believe it 
would work to supply it as a local param:


fq={!q.op=OR}field1:(1 2 3 4)

If you want to do that.

AND, your question, can you search without a 'q'?  No, but you can 
search with a 'q' that selects all documents, to be limited by the fq's.


q=*:*
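
Put together, a filter-only request is a match-all q plus one fq per clause. The sketch below assembles and URL-encodes such a query string in Java. It assumes the match-all query *:* for q, the class name is hypothetical, and the Charset overload of URLEncoder.encode requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: match-all query restricted only by filter queries.
class SelectAllWithFilters {
    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String queryString(String... filters) {
        StringBuilder sb = new StringBuilder("q=").append(enc("*:*"));
        for (String fq : filters) {
            // Each filter becomes its own fq parameter.
            sb.append("&fq=").append(enc(fq));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(queryString(
                "field1:(1 OR 2 OR 3 OR 4)",
                "field2:(4 OR 5 OR 6 OR 7)"));
    }
}
```

The output is the part after the ? of a /select URL; colons and parentheses are percent-encoded and spaces become +.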

On 3/3/2011 1:14 PM, Tanner Postert wrote:

That worked, thought I tried it before, not sure why it didn't before.

Also, is there a way to query without a q parameter?

I'm just trying to pull back all of the field results where field1:(1 OR 2
OR 3) etc. so I figured I'd use the FQ param for caching purposes because
those queries will likely be run a lot, but if I leave the Q parameter off i
get a null pointer error.

On Thu, Mar 3, 2011 at 11:05 AM, Ahmet Arslaniori...@yahoo.com  wrote:


Trying to figure out how I can run
something similar to this for the fq
parameter

Field1 in ( 1, 2, 3 4 )
AND
Field2 in ( 4, 5, 6, 7 )

I found some examples on the net that looked like this:
fq=+field1:(1 2 3
4) +field2(4 5 6 7) but that yields no results.

May be your default operator is set to AND in schema.xml?
If yes, try using +field2(4 OR 5 OR 6 OR 7)

Fwd: [Announce] Now Open: Call for Participation for ApacheCon North America

2011-03-03 Thread Grant Ingersoll

Begin forwarded message:

 From: Sally Khudairi s...@apache.org
 Date: March 3, 2011 3:10:17 PM EST
 To: annou...@apachecon.com
 Subject: [Announce] Now Open: Call for Participation for ApacheCon North 
 America
 Reply-To: s...@apache.org

 Call for Participation 
 ApacheCon North America 2011 
 7-11 November 2011 
 Westin Bayshore, Vancouver, Canada 

 All submissions must be received by Friday, 29 April 2011 at midnight Pacific 
 Time. 

 ApacheCon, the official conference, trainings, and expo of The Apache 
 Software Foundation (ASF), heads to Vancouver, Canada, this November, with 
 dozens of technical, business, and community-focused sessions for beginner, 
 intermediate, and expert audiences. 

 Now in its 11th year, the ASF develops and shepherds nearly 150 Top-Level 
 Projects and new initiatives in the Apache Incubator and Labs. With hundreds 
 of thousands of applications deploying ASF products and code contributions by 
 more than 2,500 Committers from around the world, the Apache community is 
 recognized as among the most robust, successful, and respected in Open 
 Source. 

 This year's ApacheCon focuses on highly-relevant, professionally-directed 
 presentations that demonstrate specific problems and real-world solutions. We 
 welcome proposals --from developers and users alike-- in the areas of Apache 
 and ...: 

 ... Enterprise Solutions (from ActiveMQ to Axis2 to ServiceMix, OFBiz to 
 Chemistry, the gang's all here!) 

 ... Cloud Computing (Hadoop, Cassandra, HBase, CouchDB, and friends) 

 ... Emerging Technologies + Innovation (Incubating projects such as Libcloud, 
 Stonehenge, and Wookie) 

 ... Community Leadership (mentoring and meritocracy, GSoC and related 
 initiatives) 

 ... Data Handling, Search + Analytics (Lucene, Solr, Mahout, OODT, Hive and 
 friends) 

 ... Pervasive Computing (Felix/OSGi, Tomcat, MyFaces Trinidad, and friends) 

 ... Servers, Infrastructure + Tools (HTTP Server, SpamAssassin, Geronimo, 
 Sling, Wicket and friends) 

 Submissions are open to anyone with relevant expertise: ASF affiliation is 
 not required to present at, attend, or otherwise participate in ApacheCon. 

 Whilst we encourage submissions that highlight the use of specific Apache 
 solutions, we are unable to accept marketing/commercially-oriented 
 presentations. 

 Other proposals, such as panels, have been considered in the past; you are 
 welcome to submit an alternate presentation, however, such sessions are 
 accepted under exceptional circumstances. Please be as descriptive as 
 possible, including names/bios of proposed panelists and any related details. 

 Accepted speakers (not co-presenters) qualify for general conference 
 admission and a minimum of two nights lodging at the conference hotel. 
 Additional hotel nights and travel assistance are possible, depending on the 
 number of presentations given and type of assistance needed. 

 To submit a presentation proposal, please complete our ONLINE SUBMISSION FORM 
 at http://na11.apachecon.com/proposals/new 

 To be considered, proposals must be received by Friday, 29 April 2011 at 
 midnight Pacific Time. Please email any questions regarding proposal 
 submissions to cfp AT apachecon DOT com. 

 Key Dates:

 3 March 2011 - CFP Opens 
 29 April 2011 - CFP Closes 
 20 May-30 June 2011 - Speaker Notifications and Confirmations 
 7-11 November 2011 - ApacheCon NA 2011 

 We look forward to seeing you in Vancouver! 

 – The ApacheCon Planning team 

 -
 To unsubscribe, e-mail: announce-unsubscr...@apachecon.com
 For additional commands, e-mail: announce-h...@apachecon.com

Re: Limiting on dates in Solr

2011-03-03 Thread Steve Lewis

Ugh. Of course. I fixed that a couple weeks ago, something must have crept back 
in!
Thanks a mil!

From: Andreas Kemkes a5s...@yahoo.com
To: solr-user@lucene.apache.org
Sent: Thu, March 3, 2011 4:12:02 PM
Subject: Re: Limiting on dates in Solr

2011-03-03T59:59:99.999Z - shouldn't that be 2011-03-03T23:59:59.999Z
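
A typo like this is cheap to catch on the client before the query is ever sent. A minimal Java sketch, assuming the bounds are available as strings; the class name is hypothetical, and java.time's strict ISO instant parser is used here as a stand-in for Solr's own date parsing.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;

// Hypothetical helper: rejects timestamps that are not valid ISO 8601 instants.
class TimestampCheck {
    static boolean isValid(String timestamp) {
        try {
            Instant.parse(timestamp);
            return true;
        } catch (DateTimeParseException e) {
            // Out-of-range hours, minutes or seconds end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("2011-03-03T59:59:99.999Z")); // false
        System.out.println(isValid("2011-03-03T23:59:59.999Z")); // true
    }
}
```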

From: Steve Lewis spiritualmecha...@yahoo.com
To: solr-user@lucene.apache.org
Sent: Thu, March 3, 2011 11:21:53 AM
Subject: Limiting on dates in Solr

I am treating Solr as a NoSQL db that has great search capabilities. I am 
querying on a few fields:

1. text (default)
2. type (my own string field)
3. calibration (my own date field)

I'd like to limit the results to only show the calibration using this query:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]

This mostly works, but a couple of different dates (March 5) seep into the 
March 

3rd results. Is there any way to exclude the other dates, or at least have them 
return a lower ranking in the search? I've also tried:

calibration:[2011-03-03T00:00:00.000Z TO 2011-03-03T59:59:99.999Z]  AND NOT ( 
calibration:[* TO 2011-03-03T00:00:00.000Z] OR 
calibration:[2011-03-03T59:59:99.999Z TO *])

Which I found suggested on the stackoverflow web site. I've googled a good bit 
and nothing seems to be jumping out at me. No one else appears to be trying to 
do something similar, so I may just have unrealistic expectations of what a 
search engine will do.

Thanks in advance!
Steve

Out of memory while creating indexes

2011-03-03 Thread Solr User

Hi All,

I am trying to create indexes out of a 400MB XML file using the following
command and I am running into out of memory exception.

$JAVA_HOME/bin/java -Xms768m -Xmx1024m \
  -Durl=http://$SOLR_HOST:$SOLR_PORT/solr/customercarecore/update \
  -jar $SOLRBASEDIR/dataconvertor/common/lib/post.jar \
  $SOLRBASEDIR/dataconvertor/customercare/xml/CustomerData.xml

I am planning to bump up the memory and try again.

Did anyone run into a similar issue? Any input would be very helpful in
resolving the out-of-memory exception.

I was able to create indexes with a small file but not with a large one. I am
not using SolrJ.

Thanks,
Solr User

Max Document Size

2011-03-03 Thread Sean Todd

Is there a maximum document size that Solr can handle?  I'm trying to index
documents greater than 15MB, but every time I do I get a random error.  One
of the other problems with what I'm indexing is that the documents are not in
a human language.  They are EDI documents (EDI is a B2B communication system
that is similar in format to iCal formatted documents) and don't have many
traditional word breaks but do have segment and element character breaks.  I
tried playing with the maxFieldLength parameter, but that doesn't seem to be
helping (and, yes, I changed it in both places in solrconfig.xml).

Has anyone had any similar problems with Solr?
--
Sean Todd
Senior Software Developer
EDI Technical Operations
Build.com, Inc.  http://corp.build.com/
Smarter Home Improvement™
P.O. Box 7990 Chico, CA 95927
P: 800.375.3403 x534
F: 530.566.1893
st...@build.com | Network of Stores
http://www.build.com/index.cfm?page=help:networkstores&source=emailSignature

Re: Out of memory while creating indexes

On Fri, Mar 4, 2011 at 3:32 AM, Solr User solr...@gmail.com wrote:
 Hi All,

 I am trying to create indexes out of a 400MB XML file using the following
 command and I am running into out of memory exception.

Is this a single record in the XML file? If it is more than one, breaking
it up into separate XML files, say one per record, should help.

 $JAVA_HOME/bin/java -Xms768m -Xmx1024m \
   -Durl=http://$SOLR_HOST:$SOLR_PORT/solr/customercarecore/update \
   -jar $SOLRBASEDIR/dataconvertor/common/lib/post.jar \
   $SOLRBASEDIR/dataconvertor/customercare/xml/CustomerData.xml

 I am planning to bump up the memory and try again.
[...]

If you give Solr enough memory this should work, but IMHO, it would
be better to break up your input XML files if you can.
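
For the splitting itself, a streaming parser avoids loading the whole file at once. The following is only a sketch under stated assumptions: the input is assumed to be a single <add> element whose direct children are all <doc> elements, the class name is hypothetical, and the chunks are collected as strings for brevity (for a 400MB file each chunk would be written to its own file instead).

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: splits <add><doc/>...</add> into smaller <add> chunks.
class SplitAddXml {
    static List<String> split(Reader in, int docsPerChunk) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
            XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
            XMLEventFactory events = XMLEventFactory.newInstance();
            List<String> chunks = new ArrayList<>();
            StringWriter buffer = null;
            XMLEventWriter writer = null;
            int depth = 0;        // 1 = inside <add>, 2 = inside a <doc>
            int docsInChunk = 0;
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    depth++;
                    if (depth == 2 && writer == null) {
                        // First <doc> of a new chunk: open a fresh <add> wrapper.
                        buffer = new StringWriter();
                        writer = outputFactory.createXMLEventWriter(buffer);
                        writer.add(events.createStartElement("", "", "add"));
                        docsInChunk = 0;
                    }
                    if (depth >= 2) writer.add(event);
                } else if (event.isEndElement()) {
                    if (depth >= 2) writer.add(event);
                    if (depth == 2 && ++docsInChunk == docsPerChunk) {
                        chunks.add(finish(writer, events, buffer));
                        writer = null;
                    }
                    depth--;
                } else if (depth >= 2) {
                    writer.add(event); // text and other content inside a <doc>
                }
            }
            if (writer != null) chunks.add(finish(writer, events, buffer));
            return chunks;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not split input", e);
        }
    }

    // Closes the <add> wrapper and returns the finished chunk.
    private static String finish(XMLEventWriter writer, XMLEventFactory events,
                                 StringWriter buffer) throws XMLStreamException {
        writer.add(events.createEndElement("", "", "add"));
        writer.flush();
        writer.close();
        return buffer.toString();
    }

    public static void main(String[] args) {
        String xml = "<add><doc><field name=\"id\">1</field></doc>"
                + "<doc><field name=\"id\">2</field></doc>"
                + "<doc><field name=\"id\">3</field></doc></add>";
        for (String chunk : split(new StringReader(xml), 2)) {
            System.out.println(chunk);
        }
    }
}
```

Each chunk is a complete <add> document in its own right and can be posted on its own, which keeps the size of every single update request bounded.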

Regards,
Gora

Model foreign key type of search?

2011-03-03 Thread Alex Dong

Hi there,  I need some advice on how to implement this using solr:

We have two tables: urls and bookmarks.
- Each url has four fields:  {guid, title, text, url}
- One url will have one or more bookmarks associated with it. Each bookmark
has these: {link.guid, user, tags, comment}

I'd like to return matched urls based not only on the title and text from the
url schema, but also on some kind of aggregated popularity score based on all
bookmarks for the same url. The popularity score should be based on the
number/frequency of bookmarks that match the query.

For example, take a search for "Paris".  Let's say 15 out of 1000 people have
bookmarked a tripadvisor.com page with Paris in the tag or comment field;
another 15 out of 20 people bookmarked
www.ratp.info/orienter/cv/carteparis.php with Paris in it.  I'd like to rank
the latter one, i.e. the metro planner, higher.
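
One simple way to read "number/frequency" is the fraction of a url's bookmarks that match the query. The Java sketch below works through the two numbers from the example; the class name and the choice of a plain ratio are assumptions made for illustration, not an existing Solr function.

```java
// Hypothetical scoring helper: share of a url's bookmarks matching the query.
class BookmarkPopularity {
    static double matchRatio(int matchingBookmarks, int totalBookmarks) {
        if (totalBookmarks <= 0) {
            return 0.0; // no bookmarks, no popularity signal
        }
        return (double) matchingBookmarks / totalBookmarks;
    }

    public static void main(String[] args) {
        double tripadvisor = matchRatio(15, 1000); // 0.015
        double metroPlanner = matchRatio(15, 20);  // 0.75
        System.out.println("tripadvisor.com page: " + tripadvisor);
        System.out.println("ratp.info metro planner: " + metroPlanner);
    }
}
```

Both pages have 15 matching bookmarks, but the ratio puts the metro planner (0.75) well above the tripadvisor page (0.015), which is the ordering asked for.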

I am thinking of implementing org.apache.solr.search.ValueSourceParser, which
takes a guid and runs an embedded query to get a score for this guid in the
bookmark schema. This would probably require two separate indexes to begin
with.

Keen to hear ideas on what's the best way to implement this and where I
should start.

Thanks,
Alex

Re: SolrJ Tutorial

2011-03-03 Thread Grijesh

The unit tests come with every Solr source code download, in the directory

src/test

-
Thanx:
Grijesh
http://lucidimagination.com
--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrJ-Tutorial-tp2307113p2631223.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Model foreign key type of search?