I just copied in the newer .jars and got rid of the old ones and
everything seemed to work smoothly enough.
Liam
On Tue, 2010-02-16 at 13:11 -0500, Grant Ingersoll wrote:
I've got a task open to upgrade to 0.6. Will try to get to it this week.
Upgrading is usually pretty trivial.
probably not
if there is no need to embed or programmatically start and stop the server, then
Tomcat would be the safe choice; it's probably easier to get going with at
first, and you'll find a lot more information about it
- Original Message -
From: Steve Radhouani
Thanks a lot Ron!
2010/2/17 Ron Chan rc...@i-tao.com
probably not
if there is no need to embed or programmatically start and stop the server,
then Tomcat would be the safe choice; it's probably easier to get going with
at first, and you'll find a lot more information about it
-
Hello All,
If we have a very large index, how can I back it up incrementally (one full
backup followed by multiple incremental backups)?
How do I take compressed backups?
Do I have to roll out the backup infrastructure manually, or is there
something pre-built?
Looking closer at the documentation, it appears that expungeDeletes in fact
has nothing to do with 'removing deleted documents from the index' as I
thought before:
http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22commit.22_and_.22optimize.22
expungeDeletes = true | false
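For reference, the wiki shows it as an optional attribute on the commit (and
optimize) update messages, so presumably you'd send something like:

  <commit expungeDeletes="true"/>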
abhishes wrote:
Hello All,
If we have a very large index, how can I back it up incrementally (one full
backup followed by multiple incremental backups)?
How do I take compressed backups?
http://rsnapshot.org/
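rsnapshot does incremental backups via rsync and hard links, so unchanged
segment files cost (almost) no extra space, and you can compress the
snapshots yourself. A minimal rsnapshot.conf sketch (paths are only
examples, fields must be tab-separated; check the man page):

  snapshot_root   /backups/solr/
  interval        daily   7
  backup          /var/solr/data/index/   localhost/

On Solr 1.4 you can also ask the replication handler for a consistent
snapshot first, something like
http://master:8983/solr/replication?command=backup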
Hi,
I'm having a strange problem when indexing data through our application.
Whenever I post something to the update resource, I get
Unexpected character 'a' (code 97) in prolog; expected '<' at [row,col
{unknown-source}]: [1,1]

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
Hi Group,
I need some feedback on solr security.
To make my Solr admin password protected,
I used the Path Based Authentication from
http://wiki.apache.org/solr/SolrSecurity.
In this way my admin area, search, delete, and add to index are protected.
But now when I make Solr authenticated then
Hi all! How can I get the frequency for a word in the index?
Using the Schema Browser of the Solr interface or Luke you can get the
frequency of a word in a specific field, but I don't know how to get it in
the entire index. A dirty solution would be to create a new field and copy
all your existing fields into it (<copyField source="existingField"
dest="newField"/>)
Hey there,
I see that when Solr gives me back the scores in the response, they are the
same for many different documents.
I have built a simple index for testing purposes with just documents with
one field, indexed with the standard analyzer and containing pieces of text.
I have done the same with a self
Vijayant Kumar wrote:
Hi Group,
I need some feedback on solr security.
To make my Solr admin password protected,
I used the Path Based Authentication from
http://wiki.apache.org/solr/SolrSecurity.
In this way my admin area, search, delete, and add to index are protected.
But now
when I make
Schema browser and Luke don't fit, because I need to get the frequency for a
selected word in my code! Luke displays only the first 10 words! I tried to
change some configs in solrconfig and in schema but it didn't help me! Maybe
there is another way to get the frequency for a word?
Hi Xavier,
Thanks for your feedback
the firewall rule for the trusted IP is not feasible for us because the
application is open to the public, so we cannot work through IP banning.
Vijayant Kumar wrote:
Hi Group,
I need some feedback on solr security.
To make my Solr admin password
in the Schema browser, you can specify the Top X Terms you want to display.
Here's what you have on the browser: Docs: xxx, Distinct: ..., and the
Top Terms list.
Thus, you can get the frequency of a given word, even though it's not the
most elegant solution.
2010/2/17 michaelnazaruk
Vijayant Kumar wrote:
Hi Xavier,
Thanks for your feedback
the firewall rule for the trusted IP is not feasible for us because the
application is open to the public, so we cannot work through IP banning.
Vijayant Kumar wrote:
Hi Group,
I need some feedback on solr security.
To make
I found a more interesting way:
http://localhost:8983/solr/select?q=bongo&terms=true&terms.fl=id&terms.prefix=bongo&indent=true
In terms.prefix we set the value which we want to find :)
I hope this example helps other people ...
Thanks to all who helped me :)
The file looks good to me, but as I remember, the xml must
be UTF-8 (but check). Is there a chance that somewhere in
the chain it's not?
HTH
Erick
2010/2/17 Jan Simon Winkelmann winkelm...@newsfactory.de
Hi,
I'm having a strange problem when indexing data through our application.
Whenever I
Xavier Schepler wrote:
Vijayant Kumar wrote:
Hi Xavier,
Thanks for your feedback
the firewall rule for the trusted IP is not feasible for us because the
application is open to the public, so we cannot work through IP banning.
Vijayant Kumar wrote:
Hi Group,
I need some feedback on solr
omitNorms=false is probably what you want. Did you re-create your
index for each test?
Also, what does debugQuery=true show?
You could get a copy of Luke (google Lucene Luke) and use that to
examine your index to see how things score, which would give you
some clue whether your index (and
Hi all,
we have been facing sharply increasing warmup times over the last 15 days,
which we are not able to explain, since the number of documents and their
size are stable. Before the increase we could commit our changes in nearly
20 minutes; now it takes about 2 hours.
We were able to identify the warmup of
You could set up a firewall that forbids any connection to your Solr
server's port from everyone except the computer that hosts the application
that connects to Solr.
So, only your application will be able to connect to Solr.
I believe firewalling is the only possible solution since Solr doesn't
To make my Solr admin password protected,
I used the Path Based Authentication from
http://wiki.apache.org/solr/SolrSecurity.
In this way my admin area, search, delete, and add to index are protected.
But now when I make Solr authenticated, then for every update/delete from
the front end is
On Wed, 17 Feb 2010 10:13:46 -0400
Fuad Efendi f...@efendi.ca wrote:
You could set up a firewall that forbids any connection to your
Solr server's port from everyone except the computer that hosts
your application that connects to Solr.
So, only your application will be able to connect to Solr.
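For example, with iptables on the Solr host the rule set could be as simple
as this (untested sketch; the application server IP and the Solr port are
only examples):

  iptables -A INPUT -p tcp --dport 8983 -s 192.168.1.10 -j ACCEPT
  iptables -A INPUT -p tcp --dport 8983 -j DROP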
On Tue, 2010-02-16 at 10:35 +0100, Tim Terlegård wrote:
I actually tried SSD yesterday. Queries which need to go to disk are
much faster now. I did expect that warmup for sort fields would be
much quicker as well, but that seems to be CPU bound.
That and bulk I/O. The sorter imports the Terms
Thanks, this is very helpful!
-TCK
On Tue, Feb 16, 2010 at 8:16 PM, Ahmet Arslan iori...@yahoo.com wrote:
It seems that when I do a search with a wildcard (eg,
+text:abc*) the Solr
standard SearchHandler will construct a ConstantScoreQuery
passing in a
Filter, so all the documents in
That's what I thought. I think I'll take the time to add something to the DIH to
prevent such things. Maybe a parameter that will cause the import to bail out
if the documents to index are less than X % of the total number of documents
already in the index.
There would also be a parameter to
Yup, that's also what I was thinking.
However, I do think that many real world examples cannot simply use one flat
index. If you have a big index with big documents, you may want to have a
separate, small index for things that update frequently, etc. You would need
to cross reference that
http://www.webtide.com/choose/jetty.jsp
- Original Message -
From: Steve Radhouani r.steve@gmail.com
To: solr-user@lucene.apache.org
Sent: Tuesday, 16 February, 2010 12:38:04 PM
Subject: Tomcat vs Jetty: A Comparative Analysis?
Hi there,
Is there any analysis out
Hello,
Does Solr have any hooks that allow one to watch out for any slaves not
responding to a query request in the context of distributed search? That is,
if a query is sent to shards A, B, and C, and if B doesn't actually respond
(within N milliseconds), I'd like to know about it, and I'm
Anyone come up with an answer for this?
I am using the Blacklight Ruby app and it seems to require multiple handlers
for different styles of queries.
In particular, what I am noticing is that the facet query using q=*:* seems to
produce a single shard answer.
This query produces 1 result and
I'll take a stab. IMHO, it doesn't make much sense to propagate the boost,
and here's why:
For the typical use case, copyField is used to add other searchable fields
into the default text field for Standard queries. Say we are copying the
ModelNumber field into the text field, and we have a boost
Sorry for the chaos posts, if anyone minds :)
My colleague posted more info here:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201002.mbox/%3c4b7bf56e.3080...@freiheit.com%3e
I would be very pleased if you could respond with any ideas to his post.
Regards,
Sven
- Original Message -
Drop those cache numbers. Way down. I warm up 30 million documents in about 2
minutes with the following configuration:
<documentCache
  class="solr.FastLRUCache"
  size="128"
  initialSize="10"
  cleanupThread="true" />
<queryResultCache
This read more like a press release or product brochure for Jetty than
anything else.
Then I poked around the website and realized why: it was written by the
creator of Jetty, and is hosted on the website of a company with the slogan
"The Java Experts behind Jetty".
--- On Wed, 2/17/10,
Hi,
If I change the defaultSearchField in the core schema, do I need to
recreate the index?
Thanks,
Frederico
No, you're just changing how you're querying the index, not the actual
index. You will need to restart the servlet container or reload the core
for the config changes to take effect, though.
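For a multicore setup the reload would be something like (core name is just
an example):

  http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0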
On 02/17/2010 10:04 AM, Frederico Azeiteiro wrote:
Hi,
If I change the defaultSearchField in the core
: I'm having a strange problem when indexing data through our application.
: Whenever I post something to the update resource, I get
:
: Unexpected character 'a' (code 97) in prolog; expected '<' at [row,col
{unknown-source}]: [1,1], <html>
...
: However, when I post the same data from
: That's what I thought. I think I'll take the time to add something to the
: DIH to prevent such things. Maybe a parameter that will cause the import
: to bail out if the documents to index are less than X % of the total
: number of documents already in the index.
the devil's in the details
Certainly if you come up with a general solution, the whole community will
be *very* interested <G>.
On Wed, Feb 17, 2010 at 11:14 AM, Daniel Shane sha...@lexum.umontreal.cawrote:
Yup, that's also what I was thinking.
However, I do think that many real world examples cannot simply use one
: take a look at PositionFilter
Right, there was another thread recently where almost the exact same issue
was discussed...
http://old.nabble.com/Re%3A-Tokenizer-question-p27120836.html
..except that I was ignorant of the existence of PositionFilter when I
wrote that message.
-Hoss
: How do I SWAP the old_core with the new_core. Is it to be done manually or
: does solr provide a command for doing so. What if I don't make a new
you use the SWAP command, as described in the URL that was mentioned...
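i.e. something like (core names are only examples):

  http://localhost:8983/solr/admin/cores?action=SWAP&core=new_core&other=old_core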
: : http://wiki.apache.org/solr/CoreAdmin
:
: For making a schema
Can you share the full output from the StatsComponent?
On Feb 15, 2010, at 3:07 PM, solr-user wrote:
Has anyone encountered the following issue?
I wanted to understand the statscomponent better, so I set up a simple test
index with a few thousand docs. In my schema I have:
- an
Unfortunately the file request handler does not support binary file
types (yet).
Lance's suggestion of hosting static content in another servlet
container context is the best solution for now.
Erik
On Feb 15, 2010, at 8:47 AM, Chantal Ackermann wrote:
Hi all,
Google didn't
Please bear with me on the limited understanding.
I deleted all documents and rebuilt my spell checker using the command
spellcheck=true&spellcheck.build=true&spellcheck.dictionary=default
After this I went to the schema browser and I saw that mySpellText still has
around 2000
I think we can improve the docs/wiki to show this example use case. I
noticed the wiki explanation for this filter gives a more complex shingles
example, which is interesting, but this seems to be a common problem and
maybe we should add this use case.
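Something along these lines in schema.xml is probably the minimal version
(untested sketch; the field type name is made up):

  <fieldType name="text_pf" class="solr.TextField">
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.PositionFilterFactory"/>
    </analyzer>
  </fieldType>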
On Wed, Feb 17, 2010 at 1:54 PM, Chris
: 54 results with that particular event on top. However, if I try to
: boost another term, such as +(old 97's) || granada^100 - I get over
: 300 results because it adds in all of the matches for the word
In Solr/Lucene, the keywords of AND and OR are really just syntactic
sugar for making
: in my solr we have 1,42,45,223 records having some 50GB.
: Now when I am loading a new record and when it's trying to optimize the
: docs it's taking too much memory and time.
: can anybody please tell, do we have any property in solr to get rid of this.
Solr isn't going to optimize the index unless
Grant Ingersoll-6 wrote:
Can you share the full output from the StatsComponent?
Sure. This is what I get.
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">62</int>
    <lst name="params">
      <str name="facet">on</str>
      <str
Hello all,
At some point we will need to re-build an index that totals about 2 terabytes
in size (split over 10 shards). At our current indexing speed we estimate that
this will take about 3 weeks. We would like to reduce that time. It appears
that our main bottleneck is disk I/O.
We
Burton-West, Tom wrote:
Hello all,
At some point we will need to re-build an index that totals about 2
terabytes in size (split over 10 shards). At our current indexing speed we
estimate that this will take about 3 weeks. We would like to reduce that
time. It appears that our main
: Sure. This is what I get.
That does look really weird, and definitely seems like a bug.
Can you open an issue in Jira? ... ideally with a TestCase (even if it's
not a JUnit test case, just having some sample docs that can be indexed
against the example schema and a URL showing the problem
: Is it possible to do date faceting on multiple solr shards?
Distributed search doesn't currently support date faceting...
http://wiki.apache.org/solr/DistributedSearch#Distributed_Searching_Limitations
https://issues.apache.org/jira/browse/SOLR-1709
-Hoss
hossman wrote:
That does look really weird, and definitely seems like a bug.
Can you open an issue in Jira? ... ideally with a TestCase (even if it's
not a JUnit test case, just having some sample docs that can be indexed
against the example schema and a URL showing the problem would
Hi,
Have you tried playing with mergeFactor or even mergePolicy?
--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com
On 16. feb. 2010, at 08.26, Janne Majaranta wrote:
Hey Dipti,
Basically query optimizations + setting cache sizes to a very high level.
Other than that, the
After ZooKeeper is integrated (1.5?) there will be a way to get info about all
nodes in your cluster including their roles, status etc. Perhaps you want to
coordinate your dashboard effort with this version, although it is still very
early in development. See http://wiki.apache.org/solr/SolrCloud
--
Thanks for your clarification and link, Will.
Back to Aaron's question. There is some ongoing work to try to support updating
single fields within documents (http://issues.apache.org/jira/browse/SOLR-139
and http://issues.apache.org/jira/browse/SOLR-828) which could perhaps be part
of a future
Simple question: I want to give a label to my facet queries instead of the
name of the facet field. I found documentation on the Solr site saying I can
do that by specifying the key local param, with syntax something like
facet.field={!ex=dt%20key='By%20Owner'}owner
I am just not sure what the ex=dt
What type are you posting with? Is it expecting a multipart upload?
What is the curl command and what is its mime-type for uploaded data?
On Wed, Feb 17, 2010 at 10:19 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: I'm having a strange problem when indexing data through our application.
I mean: what MIME type does the POST command use?
On Wed, Feb 17, 2010 at 7:09 PM, Lance Norskog goks...@gmail.com wrote:
What type are you posting with? Is it expecting a multipart upload?
What is the curl command and what is its mime-type for uploaded data?
On Wed, Feb 17, 2010 at 10:19
This is a quirk of Lucene - when you delete a document, the indexed
terms for the document are not deleted. That is, if 2 documents have
the word 'frampton' in an indexed field, the term dictionary contains
the entry 'frampton' and pointers to those two documents. When you
delete those two
That would be great. After reading this and the PositionFilter class I
still don't know how to use it.
On Wed, Feb 17, 2010 at 12:38 PM, Robert Muir rcm...@gmail.com wrote:
i think we can improve the docs/wiki to show this example use case, i
noticed the wiki explanation for this filter gives a
Here's the problem: the wiki page is confusing:
http://wiki.apache.org/solr/SimpleFacetParameters#Tagging_and_excluding_Filters
The line:
q=mainquery&fq=status:public&fq={!tag=dt}doctype:pdf&facet=on&facet.field={!ex=dt}doctype
is standalone, but the later line:
facet.field={!ex=dt
Okay, so if I don't want to do any excludes then I am assuming I should just
put in {key=label}field. I tried that and it doesn't work; it says
undefined field {key=label}field
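(Follow-up: I believe the local params syntax needs the bang, i.e.

  facet.field={!key=label}field

Without the ! the local params aren't parsed at all, and the whole string is
taken literally as a field name, which would explain the undefined field
error.)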
Lance Norskog-2 wrote:
Here's the problem: the wiki page is confusing:
hossman wrote:
: in my solr we have 1,42,45,223 records having some 50GB.
: Now when I am loading a new record and when it's trying to optimize the
: docs it's taking too much memory and time.
: can anybody please tell, do we have any property in solr to get rid of
: this.
Solr isn't going
Hi,
Yes, I did play with mergeFactor.
I didn't play with mergePolicy.
Wouldn't that affect indexing speed and possibly memory usage?
I don't have any problems with indexing speed ( 1000 - 2000 docs / sec via
the standard HTTP API ).
My problem is that I need very warm caches to get fast
Totally agreed!
2010/2/17 Andy angelf...@yahoo.com
This read more like a press release or product brochure for Jetty than
anything else.
Then I poked around the website and realized why: it was written by the
creator of Jetty, and is hosted on the website of a company with the slogan
"The Java