could I suggest that the Maven repositories be populated next time a
release of the Solr-specific Lucene jars is made?
But they are? They are inside the org.apache.solr group, since those
Lucene jars
are released by Solr -- http://repo2.maven.org/maven2/org/apache/solr/
Nope,
Hello,
When I send an update or a commit to Solr via curl, the response I get is
formatted in HTML; I can't find a way to get a machine-readable response.
Here is what is said on the subject in the solr config file:
The response format differs from solr1.1 formatting and returns a standard
On Wed, Mar 25, 2009 at 12:30 PM, Paul Libbrecht p...@activemath.org wrote:
could I suggest that the Maven repositories be populated next time a
release of the Solr-specific Lucene jars is made?
But they are? They are inside the org.apache.solr group, since those Lucene
jars
are released by Solr -- http://repo2.maven.org/maven2/org/apache/solr/
Hello, I'm a happy Solr user. Thanks for the excellent software!!
Hopefully this is a good question; I have indeed looked around the FAQ
and Google and such first.
I have just switched from Firefox to Opera for web browsing. (Another story)
When I use the solr/admin the home page and stats
Similar to getting range facets for dates, where we specify start, end, and gap:
can we do the same thing for numeric facets, where we specify start, end, and
gap?
--
View this message in context:
http://www.nabble.com/numeric-range-facets-tp22698330p22698330.html
Sent from the Solr - User mailing
On Wed, Mar 25, 2009 at 7:30 AM, Ashish P ashish.ping...@gmail.com wrote:
Can I get all the facets in QueryResponse??
You can get all the facets that are returned by the server. Set facet.limit
to the number of facets you want to retrieve.
See
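As a minimal sketch of the request parameters involved (the field name `category` is a made-up example; `facet.limit=-1` is the conventional way to ask for every value):

```python
# Sketch: ask Solr for every value of a facet field.
# The field name "category" is a hypothetical example.
params = {
    "q": "*:*",
    "facet": "true",
    "facet.field": "category",
    "facet.limit": -1,  # -1 means no limit: return all facet values
}

query_string = "&".join(f"{k}={v}" for k, v in params.items())
```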
On Wed, Mar 25, 2009 at 3:26 PM, Ashish P ashish.ping...@gmail.com wrote:
Similar to getting range facets for dates, where we specify start, end, and
gap:
can we do the same thing for numeric facets, where we specify start, end, and
gap?
No. But you can do this with multiple queries by using
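The truncated reply presumably refers to facet.query; a rough sketch of how start/end/gap could be emulated for a numeric field (the field name `price` and the ranges are hypothetical):

```python
# Sketch: emulate start/end/gap range faceting on a numeric field by
# generating one facet.query per bucket. Field name "price" is hypothetical.
def numeric_range_queries(field, start, end, gap):
    queries = []
    lo = start
    while lo < end:
        hi = min(lo + gap, end)
        queries.append(f"{field}:[{lo} TO {hi}]")  # note: endpoints inclusive
        lo = hi
    return queries

params = [("facet", "true")] + [
    ("facet.query", q) for q in numeric_range_queries("price", 0, 100, 25)
]
```

Each facet.query comes back as its own count in the facet_queries section of the response.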
On Wed, Mar 25, 2009 at 1:33 PM, ristretto.rb ristretto...@gmail.com wrote:
Hello, I'm a happy Solr user. Thanks for the excellent software!!
Hopefully this is a good question; I have indeed looked around the FAQ
and Google and such first.
I have just switched from Firefox to Opera for web
On Wed, Mar 25, 2009 at 12:42 PM, Pierre-Yves LANDRON
pland...@hotmail.com wrote:
Hello,
When I send an update or a commit to Solr via curl, the response I get is
formatted in HTML; I can't find a way to get a machine-readable response.
Here is what is said on the subject in the solr config file:
I'm trying to delete documents based on the following type of update
requests:
<delete><query>topologyid:3140</query><query>topologyid:3142</query></delete>
This doesn't cause any changes in the index, and if I try to read the response,
the following error occurs:
13:32:35,196 ERROR [STDERR] 25/Mar/2009 13:32:35
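For what it's worth, the intended XML body can be built programmatically rather than by hand; a sketch (note that a separate <commit/> is required before deletions become visible, which may explain the "no changes" symptom):

```python
# Sketch: build a delete-by-query request body for Solr's /update handler.
import xml.etree.ElementTree as ET

def delete_by_queries(queries):
    root = ET.Element("delete")
    for q in queries:
        ET.SubElement(root, "query").text = q
    return ET.tostring(root, encoding="unicode")

body = delete_by_queries(["topologyid:3140", "topologyid:3142"])
# POST this to /update with Content-Type: text/xml, then POST <commit/>.
```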
Hi,
Issue 1:
I have two Solr instances; I need to copy indexes from the solr1 instance to solr2
without restarting Solr.
Please suggest how this will work. Both Solr instances are on a multicore setup.
Issue 2:
I deleted all indexes from Solr and reloaded my core; the Solr admin returns 0
results.
The size of
hi,
I'm having difficulty indexing a collection of documents in a reasonable
time.
It's now going at 20 docs/sec on a c1.xlarge instance of Amazon EC2, which
just isn't enough.
This box has 8GB RAM and the equivalent of 20 Xeon processors.
These documents have a couple of stored, indexed,
Britske,
Here are a few quick ones:
- Does that machine really have 10 CPU cores? If it has significantly fewer,
you may be beyond the indexing sweet spot in terms of indexer threads vs. CPU
cores.
- Your maxBufferedDocs is super small. Comment that out anyway; use
ramBufferSizeMB and
Prerna,
You could create an index snapshot with snapshooter script and then copy the
index. You should do that while the source index is not getting modified.
Re issue #2: run optimize
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From:
Hm, I can't quite tell from here, but that is just a warning, so it's not super
problematic at this point.
Could it be that one of your other caches (query cache) is large and lots of
items are copied on searcher flip?
Could it be that your JVM doesn't have a large enough or sufficiently free heap?
Hm, where does that /solr2 come from?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: mitulpatel mitulpa...@greymatterindia.com
To: solr-user@lucene.apache.org
Sent: Wednesday, March 25, 2009 12:30:11 AM
Subject: Re: Not able to
Ah, it's hard to tell. I look at index size on disk, number of docs, query
rate, types of queries, etc.
Are you actually seeing problems with your existing servers? Or do you see a
specific performance trend in one of the aspects? (e.g. increasing latency, increased
GC or memory usage, increased
I don't understand why this sometimes takes two minutes between the start
commit /update and sometimes takes 20 minutes. One of our caches has about
~40,000 items, but I can't imagine it taking 20 minutes to autowarm a
searcher.
What do your cache configs look like?
How big is the
Yes, I guess I'm running 40k queries when it starts :) I didn't know that
each count was equal to a query. I thought it was just copying the cache
entries from the previous searcher, but I guess that wouldn't include new
entries. I set it to the size of our filterCache. What should I set the
Hello,
We've encountered a strange issue in our Solr install regarding a particular
string that just doesn't seem to want to return results, despite the exact
same string being in the index.
What makes it even stranger is that we had the same data in a previous
install of Solr, and it worked
Thanks for the quick reply.
The box has 8 real CPUs. Perhaps a good idea then to reduce the number of cores
to 8 as well. I'm testing out a different scenario with multiple boxes as
well, where clients persist docs to multiple cores on multiple boxes (which
is what multicore was invented for after
Hi,
Take the whole string to your Solr Admin - Analysis page and analyze it. Does
it get analyzed the way you'd expect it to be analyzed?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Kurt Nordstrom knordst...@library.unt.edu
To:
Greetings, I am a new subscriber. I'm Curtis Olson and I work for CACI
under contract at the U.S. Department of State, where we deal with
massive quantities of documents, so Solr is ideal for us.
We have a good sized index that we are starting to build up in
development. Some of the filter
It looks like the cache is configured big enough, but the autowarm
count is too big to have good performance.
Try something smaller and see if that fixes both problems. I imagine
even just warming the most recent 100 queries would precache the most
important ones, but try some higher
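For illustration, a solrconfig.xml filterCache entry along those lines (the sizes are examples, not recommendations):

```xml
<!-- Illustrative only: a large cache with a deliberately small autowarm count -->
<filterCache class="solr.LRUCache"
             size="40000"
             initialSize="4096"
             autowarmCount="100"/>
```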
Hi list,
I've finally settled on Solr, seeing as it has almost everything I
could want out of the box.
My setup is a complicated one. It will serve as the search backend on
Bitbucket.org, a mercurial hosting site. We have literally thousands
of code repositories, as well as users and other data.
Curtis,
Like this?
https://issues.apache.org/jira/browse/SOLR-839
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Olson, Curtis B olso...@state.gov
To: solr-user@lucene.apache.org
Sent: Wednesday, March 25, 2009 12:28:35 PM
Subject:
You could index the user name or ID, and then in your application add
the username as a filter as you pass the query back to Solr. Maybe have
an access_type that is Public or Private, and then for public searches
only include the ones that meet the access_type of Public.
Eric
On Mar 25,
On Wed, Mar 25, 2009 at 5:57 PM, Eric Pugh
ep...@opensourceconnections.com wrote:
You could index the user name or ID, and then in your application add
the username as a filter as you pass the query back to Solr. Maybe have
an access_type that is Public or Private, and then for public searches
You can even create separate indexes for private and public access if you need
to (and place them on separate machines), but I think Eric's suggestion is the
best and easiest.
On Wed, Mar 25, 2009 at 5:52 PM, Jesper Nøhr jno...@gmail.com wrote:
Hi list,
I've finally settled on Solr, seeing as it
Otis:
Okay, I'm not sure whether I should be including the quotes in the query
when using the analyzer, so I've run it both ways (no quotes on the index
value). I'll try to approximate the final tables returned for each term:
The field is dc_subject in both cases, being of type text
***
I can't see the problem with that. You can manage your users using a DB and
keep there the permissions they have, and create or delete users
without problems. You just have to maintain a working index field for each
user with the repositories' IDs he can access. Or you can create several indexes
and
Otis, that very much looks like what I'm after.
Curtis
-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, March 25, 2009 12:53 PM
To: solr-user@lucene.apache.org
Subject: Re: REST interface for Query
Curtis,
Like this?
Hi
Some of the getting started links don't work. Can you please fix them?
Hm, I must be missing something, then.
Consider this.
There are three repositories: A, B, and C. There are two users, U1 and U2.
Repository A is public, while B and C are private. Only U1 can access
B. No one can access C.
I index this data, such that Is_Private is true for B.
Now, when U2
OK, so you can create a table in a DB where you have a row for each user and a
field with the repositories he/she can access. Then you just have to look at
the DB and include the repository name in the index, so you just have to
control (using query parameters) whether the query is done for the right repositories
Otis,
Absolutely. Here are the tokenizers and filters for the text fieldtype in
the schema. http://pastebin.com/f2bb249f3
Thanks!
That's what I suspected. Want to paste the relevant tokenizer+filters
sections of your schema? The index-time and query-time analysis has to be
the same or
Which links? Please be as specific as possible.
Erick
On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote:
Hi
Some of the getting started links don't work. Can you please fix them?
Oops my mistake. Sorry for the trouble
On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson erickerick...@gmail.com wrote:
Which links? Please be as specific as possible.
Erick
On Wed, Mar 25, 2009 at 1:20 PM, nga pham nga.p...@gmail.com wrote:
Hi
Some of the getting started links don't
Hello all,
We are experimenting with the ShingleFilter with a very large document set (1
million full-text books). Because the ShingleFilter indexes every word pair as
a token, the number of unique terms increases tremendously. In our experiments
so far the tii and tis files are getting very
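For context, a shingled field type in schema.xml typically looks something like this (a sketch using the factory shipped in later Solr versions; parameter values are illustrative):

```xml
<!-- Sketch of a shingled text field type; parameter values are illustrative -->
<fieldType name="text_shingle" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="2"
            outputUnigrams="true"/>
  </analyzer>
</fieldType>
```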
http://lucene.apache.org/solr/tutorial.html#Getting+Started
The 'Lucene QueryParser syntax' link there
is not working.
On Wed, Mar 25, 2009 at 10:48 AM, nga pham nga.p...@gmail.com wrote:
Oops my mistake. Sorry for the trouble
On Wed, Mar 25, 2009 at 10:42 AM, Erick Erickson
Hi Jon:
We are running various LinkedIn search systems on Zoie in production.
-John
On Thu, Feb 19, 2009 at 9:11 AM, Jon Baer jonb...@gmail.com wrote:
This part:
The part of Zoie that enables real-time searchability is the fact that
ZoieSystem contains three IndexDataLoader objects:
OK, now I'll turn it over to the folks who actually maintain that site *G*.
Meanwhile, here's the link to the 2.4.1 query syntax.
http://lucene.apache.org/java/2_4_1/queryparsersyntax.html
Best
Erick
On Wed, Mar 25, 2009 at 2:00 PM, nga pham nga.p...@gmail.com wrote:
Hello,
After running a nightly release from around January of Solr for about 4
weeks without any problems, I'm starting to see OutofMemory errors:
Mar 24, 2009 1:35:36 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space
at
OK, we're getting closer. I just have two final questions regarding this then:
1. This would also include all the public repositories, right? If so,
how would such a query look? Some kind of is_public:true AND ...?
2. When a repository is made public, the is_public property in the
Solr index
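To question 1, a sketch of what such a filter might look like (the field names is_public and allowed_users are hypothetical, not from the thread):

```python
# Sketch: build a Solr filter query for "all public repos, plus private
# repos this user may access". Field names are hypothetical examples.
def access_filter(user_id=None):
    public = "is_public:true"
    if user_id is None:
        return public  # anonymous searches see public repositories only
    return f"({public} OR allowed_users:{user_id})"

fq = access_filter("U1")  # U1 would match public A plus private B
```

Passed as an fq parameter, this restricts every query without affecting scoring.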
I think it's the latter. I don't think the term interval is exposed anywhere.
If you expose it through the config and provide a patch, I think we can add
this to the core quickly.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From:
Would it not make more sense to wait for Lucene's IW+IR marriage and other
things happening in core Lucene that will make near-real-time search possible?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: John Wang john.w...@gmail.com
Hello there,
I'm looking for a way to implement SRW/U and OAI-PMH servers over Solr,
similar to what I have found here:
http://marc.info/?l=solr-dev&m=116405019011211&w=2 . Actually, it would be OK
if it is decoupled (not a plugin), if not better =).
I wanted to know if anyone knows if there is
Hi,
I've used Lucene before, but am new to Solr. I've gone through the
mailing list but am unable to find any clear idea of how to partition
Solr indexes. Here is what we want:
1) Be able to partition indexes by timestamp - basically one partition
per day (create a new index directory every day)
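As a sketch of item 1, the per-day partition could be derived from the document timestamp (the "index-YYYYMMDD" naming scheme is a made-up convention):

```python
# Sketch: map a document timestamp to a per-day index partition name.
# The "index-YYYYMMDD" naming scheme is a hypothetical convention.
from datetime import date

def partition_for(day: date) -> str:
    return f"index-{day:%Y%m%d}"

name = partition_for(date(2009, 3, 25))
```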
Try using a DB for permission management, and when you want to make a repo public
you just have to add its ID or name to every user's permissions field. I think
you don't need to add any is_public field to the index, just an ID or name
field in which the indexed doc is. So you can pre-filter the repos by querying the
Yes, my database is remote, MySQL 5, and I'm using Connector/J 5.1.7. My index
has 2 documents. When I try to do, let's say, 14 updates it takes about 18
sec total. Here's the resulting log of the operation:
2009-03-25 15:53:57 org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO:
I set the autowarm to 2000, which only takes about two minutes and resolves
my issues.
Thanks for your help!
best,
cloude
On Wed, Mar 25, 2009 at 9:34 AM, Ryan McKinley ryan...@gmail.com wrote:
It looks like the cache is configured big enough, but the autowarm count is
too big to have good
I implemented OAI-PMH for solr a few years back for the Massachusetts
library system... it appears not to be running right now, but
check... http://www.digitalcommonwealth.org/
It would be great to get that code revived and live open source
somewhere. As is, it uses a pre 1.3 release
Hi All,
In my project, I have one primary core containing all the basic
information for a product.
Now I need to add additional information which will be searched and displayed
in conjunction with the product results.
My question is: from a design and query-speed point of view, should I
I've a question. Is it safe to use 'localhost' as solr_hostname in
scripts.conf?
--
-Tim
Actually, what I meant was: if there are 100 indexed fields, then there are 100
facet fields, right?
So whenever I create a SolrQuery, I have to do addFacetField(fieldName).
Can I avoid this and just get all facet fields?
Sorry for the confusion.
Thanks again,
Ashish
Shalin Shekhar Mangar wrote:
My question is: from a design and query-speed point of view, should I add a
new core to handle the additional data, or should I add the data to
the existing core?
Do you ever need to get results from both sets of data in the same
query? If so, putting them in the same index will be faster.
Hi,
Without knowing the details, I'd say keep it in the same index if the
additional information shares some/enough fields with the main product data and
separately if it's sufficiently distinct (this also means 2 queries and manual
merging/joining).
Otis --
Sematext -- http://sematext.com/
Hi,
I'm not sure if anyone will be able to help without more detail. First
suggestion would be to look at Solr with a debugger/profiler to see where
memory is used up.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: smock
Hi Alex, you may be able to use CachedSqlEntityProcessor. You can do a
delta-import using full-import:
http://wiki.apache.org/solr/DataImportHandlerFaq#fullimportdelta
The inner entity can use a CachedSqlEntityProcessor.
On Thu, Mar 26, 2009 at 1:45 AM, AlexxelA alexandre.boudrea...@canoe.ca
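A data-config.xml sketch of a cached inner entity, along the lines of the wiki example (table and column names are hypothetical):

```xml
<!-- Sketch: the inner entity's query runs once and is then served from cache -->
<entity name="item" query="select id, name from item">
  <entity name="feature" processor="CachedSqlEntityProcessor"
          query="select item_id, description from feature"
          where="item_id=item.id"/>
</entity>
```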
Actually, solr2 is an application other than the default one (example) on which I
have configured my application.
Let me explain things in more detail:
My application path is http://localhost:8983/solr2/admin and I would like
to configure it for multi-core, so I have placed solr.xml in config
Hello,
Is there a best way to schedule the DataImportHandler? The idea
being to schedule a delta-import every Sunday morning at 7am, or perhaps
every hour, without human intervention. Writing a cron job to do this
wouldn't be difficult; I'm just wondering, is this a built-in feature?
Right now a cron job is the only option.
Building this into DIH has been a common request.
What do others think about this?
On Thu, Mar 26, 2009 at 10:11 AM, Tricia Williams
williams.tri...@gmail.com wrote:
Hello,
Is there a best way to schedule the DataImportHandler? The idea being to
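Until then, a crontab sketch for the two schedules mentioned (host, port, and the /dataimport handler path are assumptions about a typical setup):

```crontab
# Hypothetical crontab entries; adjust host, port, and handler path.
# Delta-import every Sunday at 07:00:
0 7 * * 0 curl -s "http://localhost:8983/solr/dataimport?command=delta-import"
# Or every hour, on the hour:
0 * * * * curl -s "http://localhost:8983/solr/dataimport?command=delta-import"
```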