Hi,
I've looked over the public Solr perf docs and done some searching on
this mailing list. Still, I'd like to seek some advice based on my
specific situation:
- 2-3 million documents / 5GB index
- each document has 40+ indexed fields, and many multivalue fields
- only primary keys are "stored"
Hi,
I have to take a backup of the indexes on my Solr server. I
know we have to give the target path in the scripts.conf file, but I
want to know the following:
1. What userId should be given in the scripts.conf file?
2. How and where do I run the scripts in the bin folder under
Replication must work fine without applying any patch; everything is committed.
On Thu, Mar 26, 2009 at 6:42 PM, sunnyfr wrote:
>
> Hi,
>
> Since I turned this functionality on, my servers sometimes take a long
> time to respond to a select:
> sometimes QTime = 4 sec, other times 200 msec?
>
>
It runs the 'deletedPkQuery' and the result set is used to delete the docs.
What specifically is your doubt?
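For reference, a hedged sketch of how deletedPkQuery is typically wired up in DIH's data-config.xml (the table and column names here are made up for illustration):

```xml
<entity name="item" pk="id"
        query="SELECT id, name FROM item"
        deltaQuery="SELECT id FROM item
                    WHERE last_modified &gt; '${dataimporter.last_index_time}'"
        deletedPkQuery="SELECT id FROM item_deletions
                        WHERE deleted_at &gt; '${dataimporter.last_index_time}'"/>
```

During a delta-import, DIH runs deletedPkQuery and issues a delete for each primary key it returns.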
On Thu, Mar 26, 2009 at 10:25 PM, Rui Pereira wrote:
> I can't find that much information about the handling of deleted rows by DIH
> in delta-imports, can you show me some examples?
> Tha
This fix is already in trunk; you may not need to apply the patch.
On Fri, Mar 27, 2009 at 6:02 AM, sunnyfr wrote:
>
> Hi,
>
> It doesn't seem to work for me. I also changed the part below; is it OK?
>> - List copiedfiles = new ArrayList();
>> + Set filesToCopy = new HashSet();
>
> ht
Hi all,
I'm having an issue with the order of my results when attempting to sort by
a function in my query. Looking at the debug output of the query, the score
returned within the result section for any given document does not match
the score in the debug output. It turns out that if I optimiz
Tom,
Aha, so you are using a single server for index updates, deletes, and searches.
This is OK for small setups and in itself is not the source of this slowness.
The problem is likely caused by you swapping searchers after each index
update/delete, and probably without warming up the new se
Hi,
It doesn't seem to work for me. I also changed the part below; is it OK?
> -List copiedfiles = new ArrayList();
> +Set filesToCopy = new HashSet();
http://www.nabble.com/file/p22734005/ReplicationHandler.java
ReplicationHandler.java
Thanks a lot,
Noble Paul നോബിള് नोब्ळ्
I'm developing a site (currently single server) that uses localsolr to
perform geo searches on ~200,000 small records although I'm expecting this
to grow significantly once I go live.
So far, so good but I've noticed that after any updates / deletions to the
index the first query is then very slo
Hi,
1) There is no need for Lucene at all. That "indexer" is whatever object you
use to send your 10K docs to Solr. Presumably each Solr instance you end up
creating will have its own "indexer" object in your application.
2)
http://wiki.apache.org/solr/CoreAdmin#head-7ca1b98a9df8b8ca0dcfbfc
XML is getting eaten by my mail client (Yahoo mail) when I hit reply. Lame.
But your config:
dismax
explicit
is missing qf and other parameters. Which fields is your DisMax supposed
to query? It doesn't know, they are not in the config above.
Otis
--
Sematext
Thanks again Otis. Few more questions,
1) My app currently is a stand-alone java app (not part of Solr JVM)
that simply calls update webservice on Solr (running in a separate web
container) passing 10k documents at once. In your example you
mentioned getting list of Indexers and adding document t
Did the XML in that message come through okay? Gmail seems to be
eating it on my end.
Anyway, while the default config has those fields, it also fails with
the application config, which has:
dismax
explicit
Since this is essentially the same as standard, I assumed it would
Standard searches your default field (specified in schema.xml).
DisMax searches fields you specify in DisMax config.
Yours has:
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
But those are not your real fields. Change them to your real fields in qf, pf
and other parts of DisM
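For the archives, a minimal dismax handler section with qf and pf filled in (the field names below come from the example schema; substitute your own):

```xml
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <str name="qf">name^1.2 features^1.0 text^0.5</str>
    <str name="pf">name^2.0</str>
  </lst>
</requestHandler>
```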
I do not have a qf set; this is the query generated by the admin interface:
dismax:
select?indent=on&version=2.2&q=test&start=0&rows=10&fl=*%2Cscore&qt=dismax&wt=standard&explainOther=&hl.fl=
standard:
select?indent=on&version=2.2&q=test&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explain
Hi,
Here in DERI [1], we are working on an extension for Lucene / Solr to
handle RDF data and structured queries. The engine is currently in use
in the Sindice [2] search engine. We are planning to release our
extension, called SIREn (for Semantic Information Retrieval Engine), as
open source
I have a very similar setup and that's precisely what we do - except
with JSON.
1) Request comes into PHP
2) PHP runs the search against several different cores (in a multicore
setup) - ours are a little more than "slightly" different
3) PHP constructs a new object with the responseHeader a
Do you have qf set? Just last week I had a problem where no results were
coming back, and it turned out that my qf param was empty.
Matt
On Thu, Mar 26, 2009 at 2:30 PM, Ben Lavender wrote:
> Hello,
>
> I'm using the March 18th 1.4 nightly, and I can't get a dismax query
> to return results. T
Hi,
Here's some info that might be helpful:
- URL you are accessing
- Response you are getting if any
- Any errors mentioned in the logs
- Your dismax config section from solrconfig.xml
- Your fields from schema.xml used in DisMax config in solrconfig.xml
Otis
--
Sematext -- http://sematext.com
: Data "A", "B", and "C" are slightly different, thus they are indexed
: differently; obviously the client receives the search results for all data
: types in a consistent/common format. The client application shall be able to
: search among each or all data types ("A", "B", "C"). The order will b
Hello,
I'm using the March 18th 1.4 nightly, and I can't get a dismax query
to return results. The standard and partitioned query types return
data fine. I'm using jetty, and the problem occurs with the default
solrconfig.xml as well as the one I am using, which is the Drupal
module, beta 6. Th
Hey, all!
I'm planning a project where I want to write software that takes an RDF class
and uses that information to dynamically support indexing and faceted searching
of resources of that type. This would (as I imagine it) function with dynamic
fields in all required data types and multiplicit
: > What I want, however, is an accurate description of the error and not just
: > a standard Apache error code.
: > Is there a way to obtain an XML response file from solr ?
: If the update command executes successfully, then the response is XML. In
: case of error, the error page is generated b
I found what the problem was. I was using a MySQL view, and it seems it doesn't
take into consideration the index I had on the last_modified field of the
original table ><. MySQL calls were taking 1 sec each :|
I just switched back to a query with a join instead of a query against my view.
Now doing aroun
Hi,
1) Look for "multicore" on Solr Wiki
2) I meant to say you would not index it all in one index (that's what you
wanted to do, no?). So in your app you'd do something like
ts = doc.getTimestamp();
indexer = getIndexer(ts); // gives you different indexer based on the ts. You
keep track of
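A rough sketch of that routing idea, assuming one core per month keyed off the document timestamp (the class and method names are hypothetical, not any Solr API):

```java
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical helper: map a document timestamp to the name of the core
// (index) it should be sent to, e.g. one core per calendar month. The
// application would keep one "indexer" client per core name, each posting
// to http://host:port/solr/<coreName>/update.
public class IndexRouter {
    public static String coreNameFor(long timestampMillis) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("UTC"));
        cal.setTimeInMillis(timestampMillis);
        // Calendar.MONTH is zero-based, hence the +1.
        return String.format("core-%04d-%02d",
                cal.get(Calendar.YEAR), cal.get(Calendar.MONTH) + 1);
    }
}
```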
Changed the config so that both WordDelimiterFilterFactory settings on both
index and query use:
org.apache.solr.analysis.WordDelimiterFilterFactory {generateNumberParts=1,
catenateWords=1, generateWordParts=1, catenateAll=0, catenateNumbers=1}
Restarted Solr, reindexed the records.
Unfortunat
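In schema.xml terms, the settings listed above correspond to something like this filter entry (shown once; it would appear under both the index and query analyzers):

```xml
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="0"/>
```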
Hi,
After installing that patch, all is running fine for me as well - problem no
longer occurring and replication running great! The issue
https://issues.apache.org/jira/browse/SOLR-978 has already been committed,
so it's also there in the 1.4 nightly builds.
Bye,
Jaco.
2009/3/26 sunnyfr
>
On Thu, Mar 26, 2009 at 10:58 AM, Otis Gospodnetic
wrote:
> Yes, UnInvertedField uses OpenBitSet.
Right, for those terms that match a large percent of the documents.
But filtering (fq params) also takes up space, so you don't want the
filterCache too large.
Look at the stats page in solr admin...
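The filterCache is sized in solrconfig.xml; a sketch with illustrative (not recommended) numbers:

```xml
<filterCache class="solr.LRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>
```

Each cached filter stored as a bitset costs roughly maxDoc/8 bytes, so the size setting trades memory for hit rate.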
Hi,
I have documents where text from two languages, e.g. (English & Korean) or
(English & German), is mixed up in a fairly intensive way. 20-30% of the
text is in English and the rest is in the other. Can somebody indicate how I
should set up the 'analyzers' and 'fields' in schema.xml? Should I hav
Thanks Otis for the response. I'm still not clear on a few things,
1) I thought Solr can work with only one index at a time. In order to
have multiple indexes you need multiple instances of Solr - isn't that
right? How can we make Solr to read/ write from and to multiple
indexes?
2) What does it me
I can't find that much information about the handling of deleted rows by DIH
in delta-imports, can you show me some examples?
Thanks in advance,
Rui Pereira
Nga,
It doesn't do that out of the box, but it could. I think you could achieve this with
either a custom XSLT that transforms the typical XML response into a new
format, or by writing a completely custom Response writer.
See:
http://wiki.apache.org/solr/QueryResponseWriter
http://wiki.apache.org
Synonym mappings are an easy way to handle specific cases like these...
C++ => cplusplus
C# => csharp
-Yonik
http://www.lucidimagination.com
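Sketched out, that would be a pair of lines in synonyms.txt plus a SynonymFilterFactory in the field's analyzer chain, e.g.:

```
# synonyms.txt
c++ => cplusplus
c# => csharp
```

```xml
<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
        ignoreCase="true" expand="false"/>
```

Note that the tokenizer ahead of this filter must leave "c++" intact (e.g. WhitespaceTokenizerFactory); StandardTokenizer strips the punctuation before the synonym filter ever sees it.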
On Thu, Mar 26, 2009 at 9:27 AM, Jana, Kumar Raja wrote:
> Hi Leonardo,
> 1. U can change the fieldtype to "string" in which case no tokenizers
> will act
Hello,
I'm having a hard time getting a multi-core Solr instance caches to show up on
Stats/Cache Admin page. This works fine with non-multicore Solr instances, of
course. This is with Solr 1.4-dev 753608 (from March 14th). Here are the
details:
- Solr home is at /data/solr_home
- Jetty i
You could probably create a type field in the index to indicate the
task type. And then use the task type plus the primary key from the
db to create the Id within the index. It would save you a lot of
maintenance, and has a bunch of benefits.
-John
On Mar 26, 2009, at 8:23 AM, "Radha C." wr
Hi All,
Can Solr export into comma delimited files?
thank you,
Nga
Harish,
Yes, UnInvertedField uses OpenBitSet.
As for the profiler, YourKit is a good one - http://www.yourkit.com/
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: smock
> To: solr-user@lucene.apache.org
> Sent: Thursday, March 26, 2009 9:58
Hi,
You should be able to access http://./solr2
There, you should see all your cores and clicking on them should take you to
/solr2/CoreNameHere/admin
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: mitulpatel
> To: solr-user@lucene.
Kurt,
Attributes for WordDelimiterFilterFactory have different values in the "index"
vs. "query" sections. Do things work if you make them identical? (you'll have
to reindex)
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Kurt Nordstro
Tim,
If localhost doesn't work for some reason, you can always use 127.0.0.1 . That
"localhost" is typically defined in /etc/hosts .
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Garafola Timothy
> To: solr-user@lucene.apache.org
> S
Was writing the email and reading responses really faster than
http://www.google.com/search?q=solr+case+insensitive ? :)
- Original Message
> From: con
> To: solr-user@lucene.apache.org
> Sent: Thursday, March 26, 2009 2:43:44 AM
> Subject: How to avoid case sensitive search?
>
>
>
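For the archives, the usual recipe is a LowerCaseFilterFactory on both the index and query analyzers of the field type, roughly:

```xml
<fieldType name="text_ci" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a single <analyzer> element and no type attribute, the same chain applies at both index and query time, so "Test" and "test" match.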
That should be OK. I did a quick scan of all the scripts that use
$solr_hostname. It defaults to localhost if it is not set.
Bill
On Wed, Mar 25, 2009 at 7:24 PM, Garafola Timothy wrote:
> I've a question. Is it safe to use 'localhost' as solr_hostname in
> scripts.conf?
>
> --
> -Tim
>
Hi Otis,
Thanks for the feedback - I'm pretty new to Java, could you or anyone else
give me some pointers on how to run Solr with a debugger/profiler? It would
be really appreciated.
More generally, is OpenBitSet a utility for UnInvertedFields? Is it
reasonable to expect that this has somethin
Just applied this patch :
http://www.nabble.com/Solr-Replication%3A-disk-space-consumed-on-slave-much-higher-than-on--master-td21579171.html#a21622876
It seems to work well now. Do I have to do anything else?
Do you recommend anything for my configuration?
Thanks a lot
--
View this message in
Hi Leonardo,
1. You can change the fieldtype to "string", in which case no tokenizers
will act on your data and the content will be stored as is.
2. If you are using Solr 1.4 (latest), then there is a provision to specify
protected words for WordDelimiterFilterFactory, which will take care of
your issue.
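The provision mentioned in point 2 looks roughly like this in schema.xml (protwords.txt lists one protected term per line, left untouched by the filter):

```xml
<filter class="solr.WordDelimiterFilterFactory"
        protected="protwords.txt"
        generateWordParts="1" catenateWords="1"/>
```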
Hello there!
Currently we're having a problem here, and we're looking for some
solutions. Right now we use the Standard Tokenizer to separate tokens
and we just found out that we cannot search for "c++" in our index
because it is not considered a word.
Since we need this search to work pro
sunnyfr wrote:
>
> Hi,
>
> Since I turned this functionality on, my servers sometimes take a
> long time to respond to a select:
> sometimes QTime = 4 sec, other times 200 msec?
>
> Do you know why? And when I look at my server graphs, the user CPU part is
> heavily used since I've applied this
Hi,
Since I turned this functionality on, my servers sometimes take a long
time to respond to a select:
sometimes QTime = 4 sec, other times 200 msec?
Do you know why? And when I look at my server graphs, the user CPU part is
heavily used since I've applied these two patches.
Thanks for your help.
I
Giovanni,
Many thanks for the reply.
We have a separate set of tables for each task, so we are going to
provide a different search for each task. The tables of one task are
unrelated to the tables of another task.
_
From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com]
Hello,
that might be a solution although it is a maintenance nightmare...
Are all those tables completely unrelated? Meaning does each table produce a
totally different document?
Either way, when you perform a search you must return a common document
(unless your client is able to distinguish betw
Thanks for your reply.
If I want to search my data spread over many tables, say more than 50
tables, do I have to set up that many cores?
_
From: Giovanni De Stefano [mailto:giovanni.destef...@gmail.com]
Sent: Thursday, March 26, 2009 5:04 PM
To: solr-user@lucene.apache.org; cra..
The SqlEntityProcessor does not ignore the error because SQL errors
are usually serious errors.
How come you have a wrong table name?
On Thu, Mar 26, 2009 at 4:15 PM, Rui Pereira wrote:
> Is there a way for DIH not to abort when an entity query is wrong (invalid
> table name or table field), this
I can't make out what the obvious mistake is
BTW why don't you use SolrJ?
http://wiki.apache.org/solr/SolrJ
--Noble
On Thu, Mar 26, 2009 at 3:57 PM, Rui Pereira wrote:
> Here is the code where I make the request:
> Document xmlDocument = this.constructDeleteXml();
>
> try {
>
Hello,
I believe you should use 2 different indexes, 2 different cores and write a
custom request handler or any other client that forwards the query to the
cores and merge the results.
Cheers,
Giovanni
On 3/26/09, Radha C. wrote:
>
> Hi,
>
> I am trying to index different tables with differen
How can I access your OAI interface (server)?
On Wed, Mar 25, 2009 at 9:01 PM, Ryan McKinley wrote:
> I implemented OAI-PMH for solr a few years back for the Massachusetts
> library system... it appears not to be running right now, but check...
> http://www.digitalcommonwealth.org/
>
> It wou
Hi,
I am trying to index different tables with different primary keys and
different fields.
Table A - primary field is a_id
Table B - primary field is b_id
How to specify two different primary keys for two different tables in
schema.xml?
Is it possible to create a data-config with differen
Is there a way for DIH not to abort when an entity query is wrong (invalid
table name or table field), this is, a way to continue with the next entity.
Thanks in advance,
Rui Pereira
If you are saying that the number of private repos for the user is limited
to say less than 10 or something like that, the query wouldn't be very
long...
Something like public:true OR repo_id:(1 OR 2 OR 3 OR 4), etc.
- Aleks
On Thu, 26 Mar 2009 09:33:14 +0100, Jesper Nøhr wrote:
Ah. Well
Here is the code where I make the request:
Document xmlDocument = this.constructDeleteXml();
try {
    URL url = new URL(this.solrPath + "/update");
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    connection.setDoOutput(true);
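For reference, the delete-by-id body that code is building up via DOM is tiny; a hedged sketch of producing it directly (assuming ids contain no XML-special characters, so escaping is skipped):

```java
// Build the XML body for a Solr delete-by-id update message by hand.
// Posting this to /update, followed by <commit/>, removes the document.
public class DeleteXml {
    public static String deleteById(String id) {
        return "<delete><id>" + id + "</id></delete>";
    }
}
```

As Noble suggests, SolrJ's deleteById/commit calls avoid hand-building this XML entirely.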
Hi,
I don't understand how my index folder can go from 11G to 45G.
Is it a problem with my segments?
For information, I'm using Solr 1.4 and I have 14M docs. The first full import
or optimize brings the size down to 11G.
I'm updating data (delta-import) every 30 min, with about 50,000 docs updated
each time.
Ah. Well that's what I thought, and that's where I get confused.
Realistically speaking, we have, say, 10.000 public repositories and
any given user may have 2 or 3 private repositories. This means that
when the user searches, he should search among all those 10.000 public
ones, but also his 2 or
Hm, my tuppence worth!
IMHO I do not think this should be built into solr. Doing it properly
leads to all kinds of nasty platform dependent issues... will we then
want to add notification features on success/failure? via email?
Ideally, all the scheduled activities on a system should be ce