Good day,
Where can I find some documentation for SolrJ? Does it have a wiki page or
something?
I am currently trying it out and I did a simple ping to see if it works.
new CommonsHttpSolrServer( url ).ping();
However, I am getting an "Exception in thread "main"
org.apache.solr.common.SolrExce
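As a cross-check outside SolrJ, the same ping can be issued over plain HTTP. A minimal Python sketch (the `admin/ping` path assumes the default handler mapping from the stock solrconfig.xml; the host and port are placeholders):

```python
from urllib.parse import urljoin

def ping_url(base_url):
    """Build the URL of Solr's ping handler from the server base URL."""
    # Ensure a trailing slash so urljoin keeps the /solr/ context path.
    if not base_url.endswith("/"):
        base_url += "/"
    return urljoin(base_url, "admin/ping")

# Against a running server one would then do (requires Solr to be up):
#   import urllib.request
#   urllib.request.urlopen(ping_url("http://localhost:8983/solr"))

print(ping_url("http://localhost:8983/solr"))
```

If the HTTP ping succeeds but the SolrJ call still throws, the problem is in the client setup rather than the server.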
Maybe there's a different way, in which path-like values like this are
treated explicitly.
I use a similar approach to Matthew at www.colfes.com, where all pages are
generated from Lucene searches according to filters on a couple of
hierarchical categories ('spaces'), i.e. subject and organisation
Good day,
danc86 of #lucene gave me the answer - I was not storing the fields :-)
Thanks,
Franz
On 8/9/07, Ryan McKinley <[EMAIL PROTECTED]> wrote:
>
> >
> > [QUESTION]
> > What could be the problem? .Or what else can I do to debug this problem?
> >
>
> In general 'luke' is a great tool to f
If we have a field spellcheck_db, and have two lines for it:
... Basically the type without
stemming...
All I want to do is make a pile of words as input to the spellcheck feature.
If I index with this, the spellcheck Analyser class c
On Thu, 9 Aug 2007 15:23:03 -0700
"Lance Norskog" <[EMAIL PROTECTED]> wrote:
> Underlying this all, you have a sneaky network performance problem. Your
> successive posts do not reuse a TCP socket. Obvious: re-opening a new socket
> each post takes time. Not obvious: your server has sockets buildi
It should probably be configurable: (1) return nothing if no match, (2)
substitute an alternate field, (3) return the first sentence or the first N
tokens.
-Sean
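A minimal Python sketch of such a fallback policy (the mode names, field names, and defaults are hypothetical illustrations, not an existing Solr option):

```python
def fallback_blurb(doc, highlighted, mode="first_tokens",
                   alt_field="summary", body_field="body", n_tokens=10):
    """Choose a snippet when the highlighter produced no fragment.

    mode: "nothing"      -> (1) return None if no match
          "alt_field"    -> (2) substitute an alternate field
          "first_tokens" -> (3) first N whitespace tokens of the body
    """
    if highlighted:              # highlighter found something: just use it
        return highlighted
    if mode == "nothing":
        return None
    if mode == "alt_field":
        return doc.get(alt_field)
    return " ".join(doc.get(body_field, "").split()[:n_tokens])
```

Client-side, this needs the body (or the alternate field) to be stored so there is something to fall back on.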
Yonik Seeley wrote on 8/9/2007, 5:50 PM:
> On 8/9/07, Benjamin Higgins <[EMAIL PROTECTED]> wrote:
> > Thanks Mike. I didn't thin
The current working directory (Cwd) is the directory from which you started
the Tomcat server and is not dependent on the Solr instance configurations.
So as long as SolrHome is correct for each Solr instance, you shouldn't have
a problem.
cheers,
Piete
On 10/08/07, Jae Joo <[EMAIL PROTECTED]>
On 8/9/07, Lance Norskog <[EMAIL PROTECTED]> wrote:
> I'm adding a field to be the source of the spellcheck database. Since that
> is its only job, it has raw text lower-cased, de-Latin1'd, and
> de-duplicated.
>
> Since it is only for the spellcheck DB, it does not need to keep duplicates.
Dupli
Jython is a Python interpreter implemented in Java. (I have a lot of Python
code.)
Total throughput in the servlet is very sensitive to the total number of
servlet sockets available vs. the number of CPUs.
The different analysers have very different performance.
You might leave some data in the
On 8/9/07, Thiago Jackiw <[EMAIL PROTECTED]> wrote:
> This may be obvious but I can't get my head straight. Is there a way
> to return a list of matching words that a record got matched against?
Unfortunately no... lucene doesn't provide that capability with
standard queries.
You could do it (slow
Here are the Catalina/localhost/ files
For "example" instance
For ca_companies instance
Urls
http://host:8080/solr/admin --> pointing to the "example" instance (Problem...)
http://host:8080/solr_ca/admin --> pointing to the "ca_companies" instance (it
is working)
-Original Message-
From: Ja
http://wiki.apache.org/solr/EmbeddedSolr
Following the example on connecting to the Index directly without using
HTTP, I tried to optimize by passing the true flag to the
CommitUpdateCommand.
When optimizing an index with Lucene directly it doubles the size of the
index temporarily and then del
On 8/9/07, Benjamin Higgins <[EMAIL PROTECTED]> wrote:
> Thanks Mike. I didn't think of creating a blurb beforehand, but that's
> a great solution. I'll probably do that. Yonik, I can still add a JIRA
> issue if you'd like, though.
Always 10 different ways to tackle the same problem in the sear
Hi,
I have built 2 solr instance - one is "example" and the other is
"ca_companies".
The "ca_companies" solr instance is working fine, but "example" is not
working...
In the admin page, "/solr/admin", for "example" instance, it shows that
Cwd=/rpt/src/apache-solr-1.2.0/ca_companies/s
Hi again,
It'd be nice to know what the starting line number is for highlighted
snippets. I imagine others might find it useful to know the starting
byte offset. Is there an easy way to add this in? I'm not afraid of
hacking the source if it's not too involved.
Thanks.
Ben
Thanks Mike. I didn't think of creating a blurb beforehand, but that's
a great solution. I'll probably do that. Yonik, I can still add a JIRA
issue if you'd like, though.
Ben
-Original Message-
From: Mike Klaas [mailto:[EMAIL PROTECTED]
Sent: Thursday, August 09, 2007 2:32 PM
To: solr
I'm adding a field to be the source of the spellcheck database. Since that
is its only job, it has raw text lower-cased, de-Latin1'd, and
de-duplicated.
Since it is only for the spellcheck DB, it does not need to keep duplicates.
I specified it as multiValued="false" and used from a few other
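For reference, a dedicated spellcheck source field of this kind is typically fed with copyField. A minimal schema.xml sketch (field and source names are hypothetical; note that once several sources copy into one destination, that destination generally has to be multiValued="true", or indexing a document with more than one value will fail):

```xml
<!-- Hypothetical names: a catch-all source field for the spellchecker -->
<field name="spellcheck_db" type="string" indexed="true" stored="false"
       multiValued="true"/>

<!-- Feed it from the fields whose vocabulary should drive suggestions -->
<copyField source="title" dest="spellcheck_db"/>
<copyField source="body" dest="spellcheck_db"/>
```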
This may be obvious but I can't get my head straight. Is there a way
to return a list of matching words that a record got matched against?
For instance:
record_a: ruby, solr, mysql, rails
record_b: solr, java
Then ?q=solr+OR+rails would return the matched words for the records
record_a: solr, ra
On 9-Aug-07, at 2:10 PM, Benjamin Higgins wrote:
Hi all, I'd like to provide a blurb of documents matching a search in
the case when there is no text highlighted. I assumed that perhaps the
highlighter would give me back the first few words in a document if this
occurred, but it doesn't. M
On 8/9/07, Benjamin Higgins <[EMAIL PROTECTED]> wrote:
> Hi all, I'd like to provide a blurb of documents matching a search in
> the case when there is no text highlighted. I assumed that perhaps the
> highlighter would give me back the first few words in a document if this
> occurred, but it does
Hi all, I'd like to provide a blurb of documents matching a search in
the case when there is no text highlighted. I assumed that perhaps the
highlighter would give me back the first few words in a document if this
occurred, but it doesn't. My conundrum is that I'd rather not grab the
whole docume
On 8/9/07, Matthew Runo <[EMAIL PROTECTED]> wrote:
> http://66.209.92.171:8080/solr/select/?q=department_exact:Apparel%
> 3EMen's%20Apparel%
> 3EJackets*&fq=country_code:US&fq=brand_exact:adidas&wt=python
>
> The same exact query, with... wait..
>
> Wow. I'm making myself look like an idiot.
>
> I
http://66.209.92.171:8080/solr/select/?q=department_exact:Apparel%
3EMen's%20Apparel%
3EJackets*&fq=country_code:US&fq=brand_exact:adidas&wt=python
The same exact query, with... wait..
Wow. I'm making myself look like an idiot.
I swear that these queries didn't work the first time I ran them.
On 8/9/07, Matthew Runo <[EMAIL PROTECTED]> wrote:
> Feel free to run some queries yourself. We opened the firewall for
> this box...
>
> http://66.209.92.171:8080/solr/select/?q=department_exact:Apparel%
> 3EMen's\%20Apparel%
> 3EJackets*&fq=country_code:US&fq=brand_exact:adidas&wt=python
OK, so
Feel free to run some queries yourself. We opened the firewall for
this box...
http://66.209.92.171:8080/solr/select/?q=department_exact:Apparel%
3EMen's\%20Apparel%
3EJackets*&fq=country_code:US&fq=brand_exact:adidas&wt=python
Hm, I don't see any attachments, I'm forwarding them to you directly.
Would anyone else like to see them?
++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
+--
Sure thing!
Heres 1, and 2.
1 - just a space.
2 - a "\ ".
++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
++
On Aug 9, 2007, at 1:14 PM, Yonik Seeley
On 8/9/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> They translate to different queries.
> But can I see the XML output for 1 and 2 with &debugQuery=on&indent=on
> appended?
Or perhaps with wt=python would be less confusing seeing that there
are '>' chars in there that would otherwise be escaped
On 8/9/07, Matthew Runo <[EMAIL PROTECTED]> wrote:
> Yes, we've reindexed several times. Here are three sample result sets..
>
> 1 - ?q=department_exact:Apparel>Men's?
> Apparel>Jackets*&fq=country_code:US&fq=brand_exact:adidas
> 2 - ?q=department_exact:Apparel>Men's\
> Apparel>Jackets*&fq=country_
Yes, we've reindexed several times. Here are three sample result sets..
1 - ?q=department_exact:Apparel>Men's?
Apparel>Jackets*&fq=country_code:US&fq=brand_exact:adidas
2 - ?q=department_exact:Apparel>Men's\
Apparel>Jackets*&fq=country_code:US&fq=brand_exact:adidas
3 - ?q=department_exact:Appa
On 8/9/07, Kevin Holmes <[EMAIL PROTECTED]> wrote:
> Python script queries the mysql DB then calls bash script
>
> Bash script performs a curl POST submit to solr
For the most up-to-date solr client for python, check out
https://issues.apache.org/jira/browse/SOLR-216
-Yonik
Is this a native feature, or do we need to get creative with scp from
one server to the other?
If it's a contention between search and indexing, separate them
via a query-slave and an index-master.
--cw
On 8/9/07, Matthew Runo <[EMAIL PROTECTED]> wrote:
> Here you go.. I thought that "string" wasn't munged, so I used that...
>
>
> stored="true"/>
>
Hmmm, that looks ok. You re-indexed since department_exact was added?
If so, could you show the exact XML response containing a document
with depa
Hmmm, I think you can map an empty (zero length) value to something else via
f.foo.map=:something
But that column does currently need to be there in the CSV.
Specifying default values on a per-request basis is interesting, and
something we could perhaps support in the future.
The quickest way to i
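A small illustration of the f.&lt;field&gt;.map idea (the field name "category" and the file path are hypothetical; the map syntax is "from:to", so an empty "from" matches empty CSV values):

```python
from urllib.parse import urlencode

# Hypothetical field name "category"; maps an empty CSV value to "unknown".
params = {
    "stream.file": "/data/companies.csv",   # hypothetical path
    "f.category.map": ":unknown",
}
query_string = urlencode(params)
```

The resulting query string would be appended to the CSV update handler URL.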
Here you go.. I thought that "string" wasn't munged, so I used that...
stored="true"/>
++
| Matthew Runo
| Zappos Development
| [EMAIL PROTECTED]
| 702-943-7833
++
On Aug 9,
On 8/9/07, Matthew Runo <[EMAIL PROTECTED]> wrote:
> Hmm.. I just tried the following three queries...
>
> /?q=department_exact:Apparel>Men's?
> Apparel>Jackets*&fq=country_code:US&fq=brand_exact:adidas...
> (no results)
>
> /?q=department_exact:Apparel>Men's\
> Apparel>Jackets*&fq=country_code:US&
On 9-Aug-07, at 7:52 AM, Ard Schrijvers wrote:
ulimit -n 8192
Unless you have an old, creaky box, I highly recommend simply upping
your filedesc cap.
-Mike
Hi -
Just looking at synonyms, and had a couple of questions.
1) For some of my synonyms, it seems to make sense to simply replace the
original word with the other (e.g. "theatre" => "theater", so searches for
either will find either). For others, I want to add an alternate term while
preserving
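Assuming the stock SynonymFilterFactory and its synonyms.txt format, the two behaviours are written differently; a small sketch:

```text
# Replacement: "theatre" is rewritten to "theater"; with the same filter
# applied at both index and query time, searches for either find either.
theatre => theater

# Expansion: both terms are kept, preserving the original.
tv, television
```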
Hi,
I noticed the first index update after I restart my jboss server always
fail with the exception below. Any update after that works fine. Does
anyone know what the problem is? The solr version I'm using is solr1.2
Thanks
Xuesong
2007-08-09 11:41:44,559 ERROR [STDERR] Aug 9, 2007 11:41:44 AM
On 8/9/07, Siegfried Goeschl <[EMAIL PROTECTED]> wrote:
> +) my colleague just finished a database import service running within
> the servlet container to avoid writing out the data to the file system
> and transmitting it over HTTP.
Most people doing this read data out of the database and constr
Hmm.. I just tried the following three queries...
/?q=department_exact:Apparel>Men's?
Apparel>Jackets*&fq=country_code:US&fq=brand_exact:adidas...
(no results)
/?q=department_exact:Apparel>Men's\
Apparel>Jackets*&fq=country_code:US&fq=brand_exact:adidas...
(no results)
/?q=Apparel>Men's\
Hi Kevin,
I'm also a newbie but some thoughts along the line ...
+) for evaluating SOLR we used a less exotic setup for data import base
on Pnuts (a JVM based scripting language) ... :-) ... but Groovy would
do as well if you feel at home with Java.
+) my colleague just finished a database i
If you check out the documentation for mergeFactor, you'll find that adjusting
it downward can lower the number of open files. Just remember that it is a
speed tradeoff, so only lower it as far as needed to stop getting the
"too many files" errors.
See this section:
http://www.onjava.c
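For illustration, the knob lives in the index settings of solrconfig.xml; the value 4 below is only an example of "lower than the default of 10":

```xml
<!-- solrconfig.xml, inside <indexDefaults>: a lower mergeFactor keeps
     fewer segments, and therefore fewer open files, at some
     indexing-speed cost -->
<mergeFactor>4</mergeFactor>
```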
On 8/9/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 8/9/07, David Whalen <[EMAIL PROTECTED]> wrote:
> > Plus, I have to believe there's a faster way to get documents
> > into solr/lucene than using curl
Oh yeah, and by "curl" I assume you meant HTTP in general. You
certainly don't want to
On 8/9/07, David Whalen <[EMAIL PROTECTED]> wrote:
> Plus, I have to believe there's a faster way to get documents
> into solr/lucene than using curl
One issue with HTTP is latency. You can get around that by adding
multiple documents per request, or by using multiple threads
concurrently.
Y
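As a sketch of the "multiple documents per request" idea (field names hypothetical), one XML `<add>` message can carry many `<doc>` elements, so a single POST amortizes the connection cost:

```python
import xml.etree.ElementTree as ET

def build_add_payload(docs):
    """Serialize several documents into one Solr <add> message,
    so one HTTP POST carries them all."""
    add = ET.Element("add")
    for doc in docs:
        d = ET.SubElement(add, "doc")
        for name, value in doc.items():
            f = ET.SubElement(d, "field", name=name)
            f.text = str(value)
    return ET.tostring(add, encoding="unicode")

payload = build_add_payload([{"id": 1, "title": "first"},
                             {"id": 2, "title": "second"}])
```

The payload would then be POSTed to the update handler; batching a few hundred documents per request usually removes most of the per-request latency.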
If it's a contention between search and indexing, separate them
via a query-slave and an index-master.
--cw
On 8/9/07, David Whalen <[EMAIL PROTECTED]> wrote:
>
> What we're looking for is a way to inject *without* using
> curl, or wget, or any other http-based communication. We'd
> like for th
On Aug 9, 2007, at 11:12 AM, Kevin Holmes wrote:
2: Is there a way to inject into solr without using POST / curl /
http?
Check http://wiki.apache.org/solr/EmbeddedSolr
There are examples in Java and Cocoa to use the DirectSolrConnection
class, querying and updating solr w/o a web serve
(re)building the index separately (ie. on a different computer) and then
replacing the active index may be an option.
David Whalen wrote:
What we're looking for is a way to inject *without* using
curl, or wget, or any other http-based communication. We'd
like for the HTTP daemon to only handle
What we're looking for is a way to inject *without* using
curl, or wget, or any other http-based communication. We'd
like for the HTTP daemon to only handle search requests, not
indexing requests on top of them.
Plus, I have to believe there's a faster way to get documents
into solr/lucene than u
Condensing the loader into a single executable sounds right if
you have performance problems. ;-)
You could also try adding multiple <doc>s in a single post if you
notice your problems are with tcp setup time, though if you're
doing localhost connections that should be minimal.
If you're already local
I inherited an existing (working) solr indexing script that runs like
this:
Python script queries the mysql DB then calls bash script
Bash script performs a curl POST submit to solr
We're injecting about 1000 records / minute (constantly), frequently
pushing the edge of our CPU / RAM limit
Hello,
Setting useCompoundFile to true should avoid the problem. You could also try to
set maximum open files higher, something like (I assume linux)
ulimit -n 8192
Ard
>
> You're a gentleman and a scholar. I will donate the M&Ms to
> myself :).
> Can you tell me from this snippet of my solrc
You're a gentleman and a scholar. I will donate the M&Ms to myself :).
Can you tell me from this snippet of my solrconfig.xml what I might
tweak to make this more betterer?
-KH
false
10
1000
2147483647
1
1000
1
Hi!
say I have 300 csv files that I need to index.
Each one holds millions of lines (each line is a few fields separated by
commas)
Each csv file represents a different domain of data (e.g. file1 is
computers, file2 is flowers, etc)
There is no indication of the domain ID in the data insid
You could try committing updates more frequently, or maybe optimising the
index beforehand (and even during!). I imagine you could also change the
Solr config, if you have access to it, to tweak indexing (or index creation)
parameters - http://wiki.apache.org/solr/SolrConfigXml should be of use to
java.io.FileNotFoundException:
/usr/local/bin/apache-solr/enr/solr/data/index/_16ik.tii (Too many open
files)
When I'm importing, this is the error I get. I know it's vague and
obscure. Can someone suggest where to start? I'll buy a bag of M&Ms
(not peanut) for anyone who can help me solve t
I just saw an e-mail from Yonik suggesting escaping the space. I know
so little about Solr that all I can do is parrot Yonik...
Erick
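Since the thread keeps coming back to escaping, here is a small Python sketch (not the author's code) of backslash-escaping a field value while preserving a trailing wildcard, then percent-encoding it for the URL:

```python
import re
from urllib.parse import quote

def escape_term(value, keep_trailing_star=True):
    """Backslash-escape Lucene query specials (and spaces) in a field value,
    optionally leaving a trailing * so prefix queries still work."""
    star = value.endswith("*") and keep_trailing_star
    if star:
        value = value[:-1]
    # Escape the classic query-parser special characters plus the space.
    escaped = re.sub(r'([+\-!(){}\[\]^"~*?:\\/ ]|&&|\|\|)', r"\\\1", value)
    return escaped + ("*" if star else "")

term = escape_term("Apparel>Men's Apparel>Jackets*")
url_fragment = "q=department_exact:" + quote(term)
```

The backslash keeps the space inside the term for the query parser, and the percent-encoding keeps it intact in the URL.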
On 8/8/07, Matthew Runo <[EMAIL PROTECTED]> wrote:
>
> OK.
>
> So a followup question..
>
> ?q=department_exact:Apparel%3EMen's%
> 20Apparel*&fq=country_code:US&fq
Hello, I'm exactly in the same situation as you. I've got some structured
subjects (as subjects:main subject/sub subject/sub sub subject) and want to
search them as literals from a given level (subjects:main subject/*). As you
know subjects:"main subject/"* doesn't work (but it should, shouldn't
That worked. I had to get the schema, get the FieldType, also get
the Fieldable object from the document, then use
fieldType.toExternal(fieldable).toString() but it ultimately worked!
Thanks for your help, appreciate it.
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECT