Thanks for your reply,
I need the Federated Search. So you mean this is not yet
supported out of the box. In that case, I have a question:
what can Collection Distribution be used for?
Jarvis
-Original Message-
From: Ryan McKinley [mailto:[EMAIL PROTECTED]
Sent: Wednesday,
Hi,
Product: Solr (Embedded), Version: 1.2
Problem Description :
While trying to add and search over the index, we are stumbling on this
error again and again.
Do note that the SolrCore is committed and closed suitably in our Embedded
Solr.
Error (StackTrace) :
Sep 19, 2007 9:41:41 AM
TestJettyLargeVolume.java
We were doing some performance testing for the updating aspects of Solr and ran into what seems to be a large problem. We're creating small documents with an id and one field of one term only, submitting them in batches of 200 with commits every
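The batching described above can be sketched as a raw XML update body for Solr 1.2's /update handler (a stdlib-only sketch; the field name term_s, the id values, and the URL are assumptions, not from the post):

```java
public class BatchAddSketch {
    // Build one <add> XML body: small docs with an id and one single-term field.
    static String buildBatch(int startId, int batchSize) {
        StringBuilder xml = new StringBuilder("<add>");
        for (int i = 0; i < batchSize; i++) {
            int id = startId + i;
            xml.append("<doc>")
               .append("<field name=\"id\">").append(id).append("</field>")
               .append("<field name=\"term_s\">term").append(id).append("</field>")
               .append("</doc>");
        }
        return xml.append("</add>").toString();
    }

    public static void main(String[] args) {
        // One batch of 200 docs, as in the post; each batch would be POSTed to
        // http://host:port/solr/update, followed by a <commit/> at some interval.
        String batch = buildBatch(0, 200);
        System.out.println(batch.length());
    }
}
```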
I am using Tomcat 6 and Solr 1.2 on a Windows 2003 server using the
following java code. I am trying to index pdf files, and I'm
constantly getting errors on larger files (the same ones).
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
SolrInputDocument addDoc =
What files are there in your /data/pub/index directory?
Bill
On 9/19/07, Venkatraman S [EMAIL PROTECTED] wrote:
Hi,
Product: Solr (Embedded), Version: 1.2
Problem Description :
While trying to add and search over the index, we are stumbling on this
error again and again.
Do note
Quite interesting actually (this is for 5 documents that were indexed):
_0.fdt _0.prx _1.fnm _1.tis _2.nrm _3.fdx _3.tii _4.frq segments.gen
_0.fdx _0.tii _1.frq _2.fdt _2.prx _3.fnm _3.tis _4.nrm segments_6
_0.fnm _0.tis _1.nrm _2.fdx _2.tii _3.frq _4.fdt _4.prx
_0.frq
One other note: the errors pop up when running against the 1.3 trunk
but do not appear to happen when run against 1.2.
- will
On 9/19/07, Will Johnson [EMAIL PROTECTED] wrote:
We were doing some performance testing for the updating aspects of Solr and
ran into what seems to be a large
Can you start a JIRA issue and attach the patch?
I have not seen this happen, but I bet it is caused by something from:
https://issues.apache.org/jira/browse/SOLR-215?page=com.atlassian.jira.plugin.ext.subversion:subversion-commits-tabpanel
Can we add that test to trunk? By default it does not
On Wed, 19 Sep 2007 01:46:53 -0400
Ryan McKinley [EMAIL PROTECTED] wrote:
Stu is referring to Federated Search - where each index has some of the
data and results are combined before they are returned. This is not yet
supported out of the box
Maybe this is related. How does this compare to
On 9/19/07, Norberto Meijome [EMAIL PROTECTED] wrote:
On Wed, 19 Sep 2007 01:46:53 -0400
Ryan McKinley [EMAIL PROTECTED] wrote:
Stu is referring to Federated Search - where each index has some of the
It really should be Distributed Search I think (my mistake... I
started out calling it
I have had this and other files index correctly using a different
combination of Tomcat/Solr versions without any problem (using similar
code; I re-wrote it because I thought it would be better to use Solrj).
I get the same error whether I use a simple StringBuilder to create the
add manually or
Nutch implements federated search separately from their index generation.
My understanding is that MapReduce jobs generate the indexes (Nutch calls them
segments) from raw data that has been downloaded, and then make them available
to be searched via remote procedure calls. Queries never pass
Daley, Kristopher M. wrote:
I have tried changing those settings, for example, as:
SolrServer server = new CommonsHttpSolrServer(solrPostUrl);
((CommonsHttpSolrServer)server).setConnectionTimeout(60);
((CommonsHttpSolrServer)server).setDefaultMaxConnectionsPerHost(100);
I'm stabbing in the dark here, but try fiddling with some of the other
connection settings:
getConnectionManager().getParams().setSendBufferSize( big );
getConnectionManager().getParams().setReceiveBufferSize( big );
Ok, I'll try to play with those. Any suggestion on the size?
Something else that is very interesting is that I just tried to do an
aggregate add of a bunch of docs, including the one that always returned
the error.
I called a function to create a SolrInputDocument and return it. I then
did the
Hi
We want to (mis)use facet search to get the number of (unique) field
values appearing in a document resultset.
I thought facet search perfect for this, because it already gives me
all the (unique) field values.
But to use it for this particular problem, we don't want all the
values
Lance Norskog wrote:
I believe I saw in the Javadocs for Lucene that there is the ability to
return the unique values for one field for a search, rather than each
record. Is it possible to add this feature to Solr? It is the equivalent of
'select distinct' in SQL.
Look into faceting:
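The faceting suggestion above can be sketched as a plain request URL: asking for facet counts on a field returns each distinct value of that field across the result set, which is effectively 'select distinct'. A stdlib-only sketch (the field name category and the base URL are assumptions):

```java
import java.net.URLEncoder;

public class FacetQuerySketch {
    // Build a Solr select URL whose response lists each distinct value of
    // `field` (with counts) across the documents matching q.
    static String facetUrl(String base, String q, String field) throws Exception {
        return base + "/select?q=" + URLEncoder.encode(q, "UTF-8")
             + "&rows=0"            // no documents, just the facet counts
             + "&facet=true"
             + "&facet.field=" + URLEncoder.encode(field, "UTF-8")
             + "&facet.limit=-1"    // return all values
             + "&facet.mincount=1"; // skip values with zero occurrences
    }

    public static void main(String[] args) throws Exception {
        System.out.println(facetUrl("http://localhost:8983/solr", "*:*", "category"));
    }
}
```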
Anyone else using this, and finding it not working in Solr 1.2? Since
we've got an automated release process, I really need to be able to have
the appserver not see itself as done warming up until the firstSearcher
is ready to go... but with 1.2 this no longer seems to be the case.
adam
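For reference, the firstSearcher hook being discussed is configured in solrconfig.xml via a QuerySenderListener; a minimal sketch (the warming query values here are assumptions, not from the thread):

```xml
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">solr</str><str name="start">0</str><str name="rows">10</str></lst>
  </arr>
</listener>
```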
On 9/19/07, Adam Goldband [EMAIL PROTECTED] wrote:
Anyone else using this, and finding it not working in Solr 1.2? Since
we've got an automated release process, I really need to be able to have
the appserver not see itself as done warming up until the firstSearcher
is ready to go... but with
On 9/19/07, Laurent Hoss [EMAIL PROTECTED] wrote:
We want to (mis)use facet search to get the number of (unique) field
values appearing in a document resultset.
We have paging of facets, so just like normal search results, it does
make sense to list the total number of facets matching.
The
Hi out of there,
I just walked through the mailing list archive, but I did not find an
appropriate answer for phrase highlighting.
I do not have any highlighting section (and no dismax handler
definition) in solrconfig.xml. This way (AFAIK :-)), the standard lucene
query syntax should be
: Product: Solr (Embedded), Version: 1.2
: java.io.FileNotFoundException: no segments* file found in
: org.apache.lucene.store.FSDirectory@/data/pub/index: files:
According to that, the FSDirectory was empty when it was opened (a file
list is supposed to come after that "files:" part)
you
On 19-Sep-07, at 1:12 PM, Marc Bechler wrote:
Hi out of there,
I just walked through the mailing list archive, but I did not find
an appropriate answer for phrase highlighting.
I do not have any highlighting section (and no dismax handler
definition) in solrconfig.xml. This way (AFAIK
: I noticed that the field list (fl) parameter ignores field names that it
: cannot locate, while the query fields (qf) parameter throws an exception
: when fields cannot be located. Is there any way to override this behavior and
: have qf also ignore fields it cannot find?
Those parameters are
On 19-Sep-07, at 2:39 PM, Marc Bechler wrote:
Hi Mike,
thanks for the quick response.
It would make a great project to get one's hands dirty
contributing, though :)
... sounds like giving a broad hint ;-) Sounds challenging...
I'm not sure about that--it is supposed to be a drop-in
: The main problem with implementing this is trying to figure out where
: to put the info in a backward compatible manner. Here is how the info
1) this seems like the kind of thing that would only be returned if
requested -- so we probably don't have to be overly concerned about
backwards
lance: since the topic you are describing is not directly related to
triggering a snapshot from the web interface can you please start a new
thread with a unique subject describing in more detail exactly what it
was you were doing and the problem you encountered?
this will make it easier for
Hi, there,
So we are using Tomcat's JNDI method to set up multiple Solr instances
within a tomcat server. Each instance has a solr home directory.
Now we want to set up collection distribution for all these solr home
indexes. My understanding is:
1. we only need to run rsync-start once use
However, if I go to the tomcat server and restart it after I have issued
the process command, the program returns and the documents are all
posted correctly!
Very strange behavior. Am I somehow not closing the connection
properly?
What version is the solr you are connecting to? 1.2 or
Hi, there,
I used an absolute path for the dir param in the solrconfig.xml as below:
<listener event="postCommit" class="solr.RunExecutableListener">
  <str name="exe">snapshooter</str>
  <str name="dir">/var/SolrHome/solr/bin</str>
  <bool name="wait">true</bool>
  <arr name="args"> <str>arg1</str>
Nutch has two ways to make a distributed query - through HDFS (the Hadoop
file system) or the RPC call in the
org.apache.nutch.searcher.DistributedSearch class.
But I think neither of these is good enough.
If we use HDFS to serve the user's query, stability is a problem. We must
all do the crawl ,
Hey all,
Let's say I have an index of one hundred documents, and these
documents are grouped into 4 groups A, B, C, and D. The groups do in
fact overlap. What would people recommend as the best way to apply a
search query and return only the documents that are in group A? Also,
how about
On Wed, 19 Sep 2007 10:29:54 -0400
Yonik Seeley [EMAIL PROTECTED] wrote:
Maybe this is related. How does this compare to the map-reduce
functionality in Nutch/Hadoop ?
map-reduce is more for batch jobs. Nutch only uses map-reduce for
parallel indexing, not searching.
I see... so in
I'm currently looking at methods of term extraction and automatic keyword
generation from indexed documents. I've been experimenting with
MoreLikeThis and values returned by the mlt.interestingTerms parameter and
so far this approach has worked well. However, I'd like to be able to
analyze
I think the index data stored in HDFS and generated by the map-reduce job
is used for searching in Nutch 0.9.
You can see the code in org.apache.nutch.searcher.NutchBean class . :)
Jarvis
-Original Message-
From: Norberto Meijome [mailto:[EMAIL PROTECTED]
Sent: Thursday, September
On Thu, 20 Sep 2007 09:37:51 +0800
Jarvis [EMAIL PROTECTED] wrote:
If we use the RPC call in Nutch.
Hi,
I wasn't suggesting to use Nutch in Solr... I'm only a young grasshopper in this
league to be suggesting architecture stuff :) but I imagine there's nothing
wrong with using what they've built
On Sep 19, 2007, at 9:58 PM, Pieter Berkel wrote:
I'm currently looking at methods of term extraction and automatic
keyword
generation from indexed documents.
We do it manually (not in solr, but we put the results in solr.) We
do it the usual way - chunk (into n-grams, named entities
Sounds like you're on the right track: if your groups overlap (i.e. a
document can be in group A and B), then you should ensure your groups
field is multivalued.
If you are searching for foo in documents contained in group A, then it
might be more efficient to use a filter query (fq) like:
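The kind of filter query suggested above might look like this as a request URL (a stdlib-only sketch; the field name groups and the base URL are assumptions):

```java
import java.net.URLEncoder;

public class FilterQuerySketch {
    // Search for q restricted to documents in the given group. The fq clause
    // is cached separately from the main query, so repeating the same group
    // restriction across many searches is cheap.
    static String searchUrl(String base, String q, String group) throws Exception {
        return base + "/select?q=" + URLEncoder.encode(q, "UTF-8")
             + "&fq=" + URLEncoder.encode("groups:" + group, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(searchUrl("http://localhost:8983/solr", "foo", "A"));
    }
}
```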
Hi,
What you describe is done by Hadoop, which supports Hardware Failure, Data
Replication, and more.
If we want to implement such a system ourselves with Solr but without HDFS,
it's a very, very complex piece of work, I think. :)
I just want to know whether there is a component
I just want to know whether there is a component
Thanks Brian, I think the smart approaches you refer to might be outside
the scope of my current project. The documents I am indexing already have
manually-generated keyword data; moving forward, I'd like to have these
keywords automatically generated, selected from a pre-defined list of
keywords
On 19-Sep-07, at 7:21 PM, Jarvis wrote:
HI,
What you say is done by hadoop that support Hardware Failure, Data
Replication and some else .
If we want to implement such a good system by ourselves without HDFS
but Solr , it's a very very complex work I think. :)
I just want
Hi, Pieter,
Thanks! Now the exception is gone. However, there's no snapshot file
created in the data directory. Strangely, the snapshooter.log seems to
complete successfully. Any idea what else I'm missing?
$ cat var/SolrHome/solr/logs/snapshooter.log
2007/09/19 20:16:17 started by solruser
If you don't need to pass any command line arguments to snapshooter, remove
(or comment out) this line from solrconfig.xml:
<arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
By the same token, if you're not setting environment variables either,
remove the following line as well:
<arr name="env">
On Thu, 20 Sep 2007 10:02:08 +0800
Jarvis [EMAIL PROTECTED] wrote:
You can see the code in org.apache.nutch.searcher.NutchBean class . :)
thx for the pointer.
_
{Beto|Norberto|Numard} Meijome
In order to avoid being called a flirt, she always yielded easily.
Charles,
On Thu, 20 Sep 2007 10:21:39 +0800
Jarvis [EMAIL PROTECTED] wrote:
What you say is done by hadoop that support Hardware Failure, Data
Replication and some else .
If we want to implement such a good system by ourselves without HDFS
but Solr , it's a very very complex work I think.
Along similar lines :
assuming that I have 2 indexes on the same box, say at:
/home/abc/data/index1 and /home/abc/data/index2,
and I want the results from both the indexes when I do a search - then how
should this be 'optimally' designed - basically these are different Solr
homes and I want
On 9/20/07, Chris Hostetter [EMAIL PROTECTED] wrote:
you imply that you are building your index using embedded solr, but based
on your stack trace it seems you are using Solr in a servlet container ...
I assume to search the index you've already built?
I have a jsp that routes the info