On Wed, May 21, 2008 at 6:27 AM, Julio Castillo [EMAIL PROTECTED] wrote:
I wanted to learn how to index data that I have in my DB.
I followed the instructions on the wiki page for the Data Import Handler
(Full Import Example -example-solr-home.jar). I got an exception running it
as is (see
Hi all,
is there a way to let Solr not only return the total number of found articles,
but also the data of the last document when, for example, only requesting the
first 10 documents?
we could do this with a separate query by either letting the second query fetch
1 row from position =
Stopwords are commonly occurring words that don't add _much_ value to
search, such as "the", "an", and "a", and are usually removed during analysis.
Protwords (protected words) are words that would be stemmed by the
English porter stemmer that you do not want to be stemmed.
In the end, removing
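For reference, both lists are wired into a field type's analyzer chain; a sketch along the lines of the Solr example schema (the file names are the conventional defaults):

```xml
<fieldType name="text" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- drop commonly occurring words listed in stopwords.txt -->
    <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
    <!-- stem terms, except those protected in protwords.txt -->
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  </analyzer>
</fieldType>
```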
Hi
We currently host an index of approx. 12GB on 5 Solr slave machines, which
are load balanced in a cluster. At some point, after 8-10 hours, a Solr
slave will give an Out of memory error, after which it just stops
responding, which then requires a restart, and after restart
Thank you very much for such a detailed reply. Can you please tell me how I
can interact with Solr from within my Java/JSP application? I mean, how to
query the Solr instance running at localhost and get results back in the
application. Do I have to change something in solrconfig.xml? Please
Just to add more:
The JVM heap allocated is 6GB with an initial heap size of 2GB. We use
quadro (which is 8 CPUs) on Linux servers for the Solr slaves.
We use facet searches and sorting.
The document cache is set to 7 million (which is the total number of documents in the index)
filtercache 1
gurudev wrote:
Hi
We
On Mon, May 19, 2008 at 2:49 PM, Chris Hostetter
[EMAIL PROTECTED] wrote:
: solr release in some time, would it be worth looking at what outstanding
: issues are critical for 1.3 and perhaps pushing some over to 1.4, and
: trying to do a release soon?
That's what is typically done when the
It is difficult to say such a thing when we consider that Solr is developed
by volunteers who dedicate their free time, or time allotted as part of a
work project, to Solr.
I think that Solr development is giving us outstanding results.
2008/5/21 Dan Thomas [EMAIL PROTECTED]:
On Mon, May 19,
Hi,
2008/5/21 Dan Thomas [EMAIL PROTECTED]:
One year between releases is a very long time for such a useful and
dynamic system. Are project leaders willing to (re)consider the
development process to prioritize improvements/features scope into
chunks that can be accomplished in shorter time
Hi,
I have an incoming field stored both as a Text field and a String field in the
Solr indexed data. In the following cases, the string field returns documents
(from the Solr client) but the text field does not.
NAME:T - no results
Name_Str:T - returns documents
Similarly for the following cases - CPN*, DPS*, S,
Hello Chris,
it sounds like you only attempted tweaking the boost value, and not
tweaking the function params ... you can change the curve so that really
new things get a large score increase, but older things get less of an
increase.
recip(rord(creationDate),1,a,b)^w
I was tweaking the
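As background on how that function curves: recip(x,m,a,b) computes a/(m*x+b), so recip(rord(creationDate),1,a,b) gives the newest document (reverse ordinal 1) a score of roughly a/(1+b) and older documents progressively less; raising b flattens the boost for newer documents, while a scales the overall magnitude. A sketch with illustrative values (a=b=1000 means the newest doc gets about 1.0 and the 1000th-newest about 0.5; the weight is just an example):

```xml
<str name="bf">recip(rord(creationDate),1,1000,1000)^2.0</str>
```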
On Wed, May 21, 2008 at 7:40 PM, Andrew Savory [EMAIL PROTECTED] wrote:
Hi,
2008/5/21 Dan Thomas [EMAIL PROTECTED]:
One year between releases is a very long time for such a useful and
dynamic system. Are project leaders willing to (re)consider the
development process to prioritize
Ezra Epstein wrote:
<str name="fq">storeAvailableDate:[* TO NOW]</str>
<str name="fq">storeExpirationDate:[NOW TO *]</str>
...
This works perfectly. Only trouble is that the two data fields may
actually be empty, in which case this filters out such records and we
want to include them.
I
As a work-around that'd work. It means either changing the contents of
the data sets or changing the schema and how data are fed to
SOLR/Lucene.
I'm hoping to be able to put an expression in the fq param instead, if
that's supported.
-Original Message-
From: Daniel Papasian
Noble Paul,
I took a look at the jar files included in the nightly builds and they do
not include the dataimport.jar content. So, I assume then that my best
approach is to download the corresponding dataimport sources used and build
my own dataimport.jar?
Thanks
** julio
-Original
OK, I just downloaded the source tree and discovered that the sources for
the dataimport handler are not there.
I guess I have to download the SOLR-469-contrib.patch
I suppose that later the source tree will have a proper contrib directory
rather than a patch?
Thanks
** julio
-Original
You will have to excuse me here, but I can't find the contrib sources. I have
nothing to apply the patch to.
I used the following URL to get the SVN sources (per the website):
http://svn.apache.org/repos/asf/lucene/solr/.
Sorry, I'm a newbie with Solr, but intend to use it to index my data on the
Hi Julio,
Please download the SOLR-469.patch (not the contrib patch) from the
SOLR-469 jira issue and apply it to the latest trunk code. I apologize
for not keeping the example in the wiki in sync with the latest code.
Please let us know here if you face a problem.
On Wed, May 21, 2008 at 10:46
Hi Akeel,
Take a look at SolrJ which is a Java client library for Solr. It is
packaged with the Solr nightly binary downloads. This can be used by
your Java/JSP application to add documents or query Solr. No changes
to any config files are needed.
On Wed, May 21, 2008 at 5:15 PM, Akeel [EMAIL
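For illustration only (SolrJ is the recommended route), here is a minimal sketch of the plain HTTP interface that SolrJ wraps, showing how a query URL against the default example setup is assembled; the host, port, handler path, and query are assumptions, not taken from Akeel's setup:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SolrQueryUrl {
    // Build a Solr select URL for a given query string.
    // Host, port, and /solr/select path are the defaults from the example setup.
    public static String buildSelectUrl(String host, int port, String query, int rows) {
        String q = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return "http://" + host + ":" + port + "/solr/select?q=" + q
                + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        // prints http://localhost:8983/solr/select?q=name%3Asolr&rows=10&wt=json
        System.out.println(buildSelectUrl("localhost", 8983, "name:solr", 10));
    }
}
```

Fetching that URL returns the response in the writer format named by wt; SolrJ does the same over the wire but parses the response into Java objects for you.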
Here's the link to wiki documentation on SolrJ
http://wiki.apache.org/solr/Solrj
On Wed, May 21, 2008 at 11:09 PM, Shalin Shekhar Mangar
[EMAIL PROTECTED] wrote:
Hi Akeel,
Take a look at SolrJ which is a Java client library for Solr. It is
packaged with the Solr nightly binary downloads.
: I'm hoping to be able to put an expression in the fq param instead, if
: that's supported.
you have to invert your logic. Docs that have not yet expired, or will
never expire, match the negated query for docs expired in the past...
fq = -storeExpirationDate:[* TO NOW]
-Hoss
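The same inversion could be applied to the availability field from the original question; a sketch of the resulting filter pair (field names taken from the earlier message, not verified against any schema):

```xml
<str name="fq">-storeExpirationDate:[* TO NOW]</str>
<str name="fq">-storeAvailableDate:[NOW TO *]</str>
```

Because each clause excludes only documents with a disqualifying date, documents where either field is empty pass both filters, which is the behavior the original question asked for.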
But that means that it can't fit all documents in the cache, doesn't
it? The index is 12GB and your allocated heap is 6GB... 12GB > 6GB...
/Jimi
Quoting gurudev [EMAIL PROTECTED]:
Just to add more:
The JVM heap allocated is 6GB with initial heap size as 2GB. We use
quadro(which is 8 cpus)
On 21-May-08, at 2:35 AM, Tim Mahy wrote:
Hi all,
is there a way to let Solr not only return the total number of found
articles, but also the data of the last document when for example
only requesting the first 10 documents?
we could do this with a separate query by either letting the
: One year between releases is a very long time for such a useful and
: dynamic system. Are project leaders willing to (re)consider the
: development process to prioritize improvements/features scope into
: chunks that can be accomplished in shorter time frames - say 90 days?
: In my experience,
I had the same problem a few weeks ago. You can try these:
1. Check the hit ratio for the caches via solr/admin/stats.jsp. If
the hit ratio is very low, just disable those caches. It will save you
some memory.
2. Setting -Xms and -Xmx to the same size will help improve GC performance.
3. Check
: That's true, but that's not the problem. The problem is that you can't call
: qt=spellchecker if you redefine /select in solrconfig.xml. I was wondering
: how I could add qt functionality back.
If you override /select to bind it to a specific handler, then you lose
the ability to pick a handler
:
: I'm indexing pages from multiple domains. In any given
: result set, I don't want to return more than two links
: from the same domain, so that the first few pages won't
: be all from the same domain. I suppose I could get more
: (say, 100) pages from solr, then sort in memory in the
:
Sorry, but how does field collapsing work? Is there documentation about this
anywhere? Thanks!
On Wed, May 21, 2008 at 7:02 PM, Chris Hostetter [EMAIL PROTECTED]
wrote:
:
: I'm indexing pages from multiple domains. In any given
: result set, I don't want to return more than two links
: from the
There is documentation:
http://wiki.apache.org/solr/FieldCollapsing
Koji
Jonathan Ariel wrote:
Sorry, but how does field collapsing work? Is there documentation about this
anywhere? Thanks!
Not sure, but try using:
<delete><query>document_id:A-395 OR document_id:A-1949</query></delete>
On Thu, May 22, 2008 at 7:46 AM, Tracy Flynn
[EMAIL PROTECTED] wrote:
I'm trying to exploit 'Delete by Query' with multiple IDs in the query.
I'm using vanilla SOLR 1.2
My schema specifies.
thanks everyone
On Thu, May 22, 2008 at 7:18 AM, Grant Ingersoll [EMAIL PROTECTED]
wrote:
See http://lucene.apache.org/solr/tutorial.html. You can also see the
wiki for a whole bunch of docs, including links to tutorials, etc.
Also, just for future reference, please separate out questions
I'm trying to configure a document config file using the example
data-config.xml mentioned in the wiki.
One question I have is when to nest the entity tags/nodes in the XML file.
The proposed example has them nested as
<document>
  <entity>
    <entity>
    </entity>
    <entity>
Actually, the best documentation is really the comments in the JIRA issue
itself.
Is there anyone actually using Solr with this patch?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Koji Sekiguchi [EMAIL PROTECTED]
To:
Hi Julio,
Entities are nested when they have parent-child relationships as in a
SQL Join. For example, if your product has categories, you will create
an entity for products and a child entity for categories. However, if
your entities are totally independent of each other, then you can keep
them
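A sketch of what that parent-child nesting might look like in data-config.xml (the table and column names here are invented for illustration):

```xml
<document>
  <!-- parent entity: one Solr document per product row -->
  <entity name="product" query="select id, name from product">
    <!-- child entity: runs once per product row, joined on the parent's id -->
    <entity name="category"
            query="select description from category where product_id='${product.id}'"/>
  </entity>
</document>
```

The ${product.id} variable is how the child query refers to the current row of its parent entity; independent entities would instead sit side by side directly under <document>.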
Julio,
This is to convert the 1:n and m:n relationships in a DB to
multivalued fields in solr. A single sql query ends up giving a 2D
matrix where each cell holds one value. It would be harder to
denormalize and extract the multivalued fields from a single result
set. Check the architecture to