Hi Solr Users,
I have set up a Solr server with a custom schema and have already
updated the index with some content from XML files.
Now I am trying to index the contents of a folder. The folder consists
of various document types (pdf, doc, xls, ...).
Is there a howto anywhere on how I can parse the
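(For what it's worth: Solr itself only accepts its XML update format, so the
usual pattern is to extract the text externally, e.g. with PDFBox for PDF or
Apache POI for Office files, and post the result as an <add> message. Below is
a minimal sketch of the posting half; the field names, host, and port are
assumptions to adapt to your schema.)

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class PostExtractedDoc {
    public static void main(String[] args) throws Exception {
        // Text extracted beforehand by the parser of your choice (PDFBox, POI, ...).
        String id = "folder/report.pdf";                  // hypothetical unique key
        String body = "plain text pulled out of the PDF"; // hypothetical content

        String xml = "<add><doc>"
                + "<field name=\"id\">" + escape(id) + "</field>"
                + "<field name=\"text\">" + escape(body) + "</field>"
                + "</doc></add>";

        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:8983/solr/update").openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        OutputStream out = conn.getOutputStream();
        out.write(xml.getBytes("UTF-8"));
        out.close();
        System.out.println("HTTP " + conn.getResponseCode()); // 200 on success
    }

    // Minimal XML escaping so extracted text cannot break the update message.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}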
For other Solr instances (whether embedded or not) to refresh their
index searchers, send a <commit/> message to them.
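Concretely, that message is just an HTTP POST to the update handler.
A minimal sketch (host and port are assumptions):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class SendCommit {
    public static void main(String[] args) throws Exception {
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:8983/solr/update").openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        OutputStream out = conn.getOutputStream();
        out.write("<commit/>".getBytes("UTF-8")); // asks Solr to open a fresh searcher
        out.close();
        System.out.println("HTTP " + conn.getResponseCode()); // 200 on success
    }
}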
Erik
On Aug 21, 2007, at 7:33 AM, sinking wrote:
Hello,
I have tried to use the EmbeddedSolr
(http://wiki.apache.org/solr/EmbeddedSolr) because I want to work
On Tue, 2007-08-21 at 11:52 +0200, Ard Schrijvers wrote:
you're missing the key piece that Ard alluded to ... there is one
ordered list of all terms stored in the index ... a TermEnum lets you
iterate over this ordered list, and the IndexReader.terms(Term) method
lets you
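To make that concrete, here is a sketch against the Lucene 2.x API:
reader.terms(t) returns a TermEnum positioned at the first term >= t in
that single ordered list. The index path and field name are assumptions:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.TermEnum;

public class WalkTerms {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open("/path/to/solr/data/index");
        // Seek to the first term of the (hypothetical) "title" field.
        TermEnum terms = reader.terms(new Term("title", ""));
        try {
            do {
                Term t = terms.term();
                if (t == null || !"title".equals(t.field())) break; // walked past the field
                System.out.println(t.text() + " (docFreq=" + terms.docFreq() + ")");
            } while (terms.next());
        } finally {
            terms.close();
            reader.close();
        }
    }
}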
Installing the patch requires downloading the latest Solr via
Subversion and applying the patch to the source. Eric has kept his
patch updated against various Subversion revisions; to make sure it
will compile, I suggest getting the revision he lists.
As for using the features of this patch, this is
Hi!
Is there a way to use an MMapDirectory instead of FSDirectory within Solr?
Our index is quite big and it takes a long time to build up in the OS
cache. I'm wondering if an MMapDirectory could help get our data into
memory quicker (our index on disk is bigger than our
memory
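If memory serves, Lucene 2.x chooses its FSDirectory implementation from a
system property, so something like the sketch below (or the equivalent -D
flag on the servlet container's JVM) may be enough to switch Solr to
MMapDirectory. The property name is from memory; verify it against your
Lucene version:

public class MMapLauncher {
    public static void main(String[] args) {
        // Must run before any Lucene code calls FSDirectory.getDirectory();
        // on Tomcat/Jetty you would instead pass
        //   -Dorg.apache.lucene.FSDirectory.class=org.apache.lucene.store.MMapDirectory
        // on the JVM command line.
        System.setProperty("org.apache.lucene.FSDirectory.class",
                           "org.apache.lucene.store.MMapDirectory");
        // ... then start or embed Solr as usual ...
    }
}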
I am a little confused about how you have things set up: these metadata
files contain certain information, and there may or may not be a pdf,
xls, or doc associated with each one?
If that is the case, if it were me I would write something to parse
the metadata files, and if there is a binary file
Hi all,
I'm wondering what's the best way to completely replace a big index
without losing any requests.
Here's how I do it at the moment:
The Solr index is a soft link to a directory dir.
When I want to install a new index (in dir.new), I do a
mv dir dir.old ; mv dir.new dir
Then I ask for a
I guess the first question is why you have to swap in a big index, instead
of rsync'ing or using another method. I've entertained the idea of putting
a load balancer in front of two Solr instances. In this scenario, take one
offline, swap in the index, bring it back up, and then bring down the other.
I've seen even longer commit times with our 2GB index and have not had a
chance to look into it more deeply. What I have noticed is that when there
are searchers registered, commits take a lot longer. Perhaps looking at
the optional attributes for commit (waitSearcher, waitFlush) would help.
Since we
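For reference, those attributes go directly on the commit message, which is
posted to /solr/update the same way as a plain <commit/> (see the POST sketch
earlier in the thread). A small sketch; both attributes default to true:

public class CommitOptions {
    public static void main(String[] args) {
        // waitFlush=false: don't block until index changes are flushed to disk.
        // waitSearcher=false: don't block until the new searcher is warmed
        // and registered (usually the expensive part).
        String commit = "<commit waitFlush=\"false\" waitSearcher=\"false\"/>";
        System.out.println(commit);
    }
}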
: The conclusion is that setting URIEncoding=UTF-8 in the Connector
: section in server.xml is not enough
:
: I also needed to add -Dfile.encoding=UTF-8 to Tomcat's java
: startup options (in catalina.bat)
seeing how you resolved this problem has got me thinking ... how did you
index the
: How long should a commit take? I've got about 9.8G of data for 9M of
: records. (Yes, I'm indexing too much data.) My commits are taking 20-30
the low levels of updating aren't my forte, but as I recall the dominant
factor in how long it takes to execute a commit is the number of deleted
On 8/21/07, Peter Manis [EMAIL PROTECTED] wrote:
I am a little confused about how you have things set up: these metadata
files contain certain information, and there may or may not be a pdf,
xls, or doc associated with each one?
Yes, you have it right.
If that is the case, if it were me I would
On 8/21/07, Vish D. [EMAIL PROTECTED] wrote:
On 8/21/07, Peter Manis [EMAIL PROTECTED] wrote:
I am a little confused about how you have things set up: these metadata
files contain certain information, and there may or may not be a pdf,
xls, or doc associated with each one?
Yes, you have it
I can't find the documentation, but I believe Apache's max URL length is
8192, so I would assume a lot of other apps like Tomcat and Jetty would be
similar. I haven't run into any problems yet.
Maybe shoot Eric an email and see if he would be interested in
adapting the code to take XML as well, so that
Trying the query approach with a 3GB index takes over a minute to
clear the index.
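For context, the "query approach" here is a delete-by-query update message
followed by a commit. A sketch of clearing everything, assuming the
unique-key field is named "id" and the usual host/port:

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class ClearIndex {
    public static void main(String[] args) throws Exception {
        // Delete every document via an open-ended range over the (assumed)
        // unique-key field, then commit so searchers see the empty index.
        post("<delete><query>id:[* TO *]</query></delete>");
        post("<commit/>");
    }

    static void post(String msg) throws Exception {
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:8983/solr/update").openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        OutputStream out = conn.getOutputStream();
        out.write(msg.getBytes("UTF-8"));
        out.close();
        if (conn.getResponseCode() != 200) {
            throw new RuntimeException("update failed: HTTP " + conn.getResponseCode());
        }
    }
}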
The reason not to stop the servlet container and delete the files
manually is that in a particular environment the person testing may not
have access to the filesystem directly. Usually you want to do
It might be worthwhile to have a hibernate mode for Solr, where suspend
waits until all requests are finished, then closes all files and rejects
all new requests. Later, a wakeup command would bring it back online.
During this time, a remotely controlled job could remove the data
directory. This
On 21/08/07, Pierre-Yves LANDRON [EMAIL PROTECTED] wrote:
It seems the highlight fields must be specified, and that I can't use the
* wildcard to do so.
Am I right? Is there a way to get around this restriction?
As far as I know, dynamic fields are used mainly during indexing and
Recently someone mentioned that it would be possible to have a 'replace
existing document' feature rather than just dropping and adding documents
with the same unique id. We have a few use cases in this area and I'm
researching whether it is effective to check for a document via Solr
queries, or
: chance to look into it more deeply. What I have noticed is that when there
: are searchers registered, commits take a lot longer. Perhaps looking at
that's probably the warming time taken to reopen the new searcher ...
waitSearcher=false should cause those commits to return much faster (the
down
: I'm just seeing if there's an easy/performant way of doing it with Solr.
: For a solution with raw Lucene, creating a new index with the same
: directory cleared out an old index (even on Windows with its file
: locking) quickly.
there has been talk of optimizing delete by query in the case