I am starting to look at Solr's Data Import Handler framework and am quite
impressed with it so far. I am trying to reduce the number of SQL queries
issued to the database and came across this entity processor.
In the following example:
entity name=x query=select * from x
entity name=y
Hello,
I get an Unknown field error when I'm indexing an Oracle DB. I've reduced the
number of fields/columns in order to troubleshoot. If I change the uniqueKey
to timestamp (for example) and create a dynamic field dynamicField name=*
type=text indexed=true stored=true the indexing works fine,
On Tue, 25 Nov 2008 03:59:31 +0200
Timo Sirainen [EMAIL PROTECTED] wrote:
would it be faster to say q=user:user AND highestuid:[ * TO *] ?
Now that I read again what fq really did, yes, sounds like you're right.
you may want to compare them both to see which one is better... I just went
On Mon, Nov 24, 2008 at 7:56 PM, rameshgalla [EMAIL PROTECTED] wrote:
1) Which languages does Solr support out of the box other than English?
Solr does not know about any languages. It will apply whatever analyzers you
specify in the schema.xml for that field type.
2)What are the
which version of DIH are you using?
On Tue, Nov 25, 2008 at 5:24 PM, Joel Karlsson [EMAIL PROTECTED] wrote:
Hello,
I get an Unknown field error when I'm indexing an Oracle DB. I've reduced the
number of fields/columns in order to troubleshoot. If I change the uniqueKey
to timestamp (for example)
I actually don't know which version I was using, but now I've upgraded to
1.3 and it works like a charm!! Thanks a lot!
2008/11/25 Noble Paul നോബിള് नोब्ळ् [EMAIL PROTECTED]
which version of DIH are you using?
On Tue, Nov 25, 2008 at 5:24 PM, Joel Karlsson [EMAIL PROTECTED]
wrote:
Hello,
Even if you go for the 400,000 documents way, the size of data and number of
unique tokens would remain the same. With your data size, you should think
about sharding and distributed search.
Is the availability of a product a boolean value or the number of items? To
make sure that you don't need
https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg
https://issues.apache.org/jira/secure/attachment/12394070/sslogo-solr-finder2.0.png
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394376/solr_sp.png
https://issues.apache.org/jira/secure/attachment/12393936/logo_remake.jpg
On Tue, Nov 25, 2008 at 7:49 AM, souravm [EMAIL PROTECTED] wrote:
3. Another case is - if there are 2 search requests concurrently hitting
the server, each with sorting on the same 20 character date field, then also
it would need 2x2GB memory. So if I know that I need to support at least 4
The easiest solution would be to create the documents you send to solr
with multiple keywords fields... they will be separated by a
positionIncrement so a phrase query won't see yankees adjacent to
cleveland.
If you can't do that, then perhaps patch PatternTokenizer filter to
put a larger
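The effect of a position gap between values of a multi-valued field can be sketched in a few lines. This is a simplified toy model, not Lucene's actual indexing code: the function names, the whitespace tokenization, and the sample field values are all assumptions made for illustration.

```python
def index_positions(values, gap):
    """Assign token positions across the values of one multi-valued field.

    Simplified model: each later value starts `gap` positions after the
    slot that would otherwise follow the previous value's last token.
    """
    positions = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # leave a hole between consecutive values
        for token in value.lower().split():
            positions.setdefault(token, []).append(pos)
            pos += 1
    return positions


def phrase_match(positions, phrase):
    """Return True if the phrase tokens occur at consecutive positions."""
    tokens = phrase.lower().split()
    for start in positions.get(tokens[0], []):
        if all((start + i) in positions.get(tok, [])
               for i, tok in enumerate(tokens)):
            return True
    return False


# Hypothetical keyword values for a single document.
keywords = ["new york yankees", "cleveland indians"]

no_gap = index_positions(keywords, gap=0)
with_gap = index_positions(keywords, gap=100)

print(phrase_match(no_gap, "yankees cleveland"))
print(phrase_match(with_gap, "yankees cleveland"))
print(phrase_match(with_gap, "cleveland indians"))
```

With no gap the phrase spanning two values matches; with a gap it does not, while phrases inside a single value still match.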
On Tue, Nov 25, 2008 at 1:52 PM, Amit Nithian [EMAIL PROTECTED] wrote:
I like the concept of having multiple entity blocks for clarity but why
wouldn't I have (for DB efficiency), the following as one entity's SQL
statement select * from X,Y where x.id=y.xid and have two fields
pointing
at
On Mon, Nov 24, 2008 at 11:51 PM, Timo Sirainen [EMAIL PROTECTED] wrote:
DIH seems to be about Solr pulling data into it from an external source.
That's not really practical with Dovecot since there's no central
repository of any kind of data, so there's no way to know what has
changed since
Hi Tom,
I don't think anybody has worked on adding this to Solr yet. Do you mind
opening a jira issue?
On Tue, Nov 25, 2008 at 12:01 AM, Burton-West, Tom [EMAIL PROTECTED] wrote:
Hello all,
We are having problems with extremely slow phrase queries when the
phrase query contains a common
Every row emitted by an outer entity results in a new SQL query in the
inner entity (yes, 50 queries on the inner entity). So, if you wish to
join multiple tables, then nested entities are the way to go.
CachedSqlEntityProcessor is meant to help you reduce the number of
queries fired on sub-entities.
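The difference between one query per outer row and a cached lookup can be sketched outside Solr. The example below is not DIH code: it uses Python's sqlite3 with hypothetical tables x and y and made-up rows, and the function names are illustrative only. It just counts the queries each strategy issues.

```python
import sqlite3
from collections import defaultdict


def build_db():
    """Create a small in-memory database with hypothetical tables x and y."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE x (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE y (id INTEGER PRIMARY KEY, xid INTEGER, val TEXT);
        INSERT INTO x VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO y VALUES (10, 1, 'p'), (11, 1, 'q'), (12, 3, 'r');
    """)
    return conn


def nested_per_row(conn):
    """One inner query per outer row, as with plain nested entities."""
    outer = conn.execute("SELECT id, name FROM x ORDER BY id").fetchall()
    queries = 1
    docs = []
    for xid, name in outer:
        rows = conn.execute(
            "SELECT val FROM y WHERE xid = ? ORDER BY id", (xid,)
        ).fetchall()
        queries += 1
        docs.append({"id": xid, "name": name, "vals": [r[0] for r in rows]})
    return docs, queries


def nested_cached(conn):
    """Read the inner table once into an in-memory map keyed by xid."""
    outer = conn.execute("SELECT id, name FROM x ORDER BY id").fetchall()
    queries = 1
    cache = defaultdict(list)
    for xid, val in conn.execute("SELECT xid, val FROM y ORDER BY id"):
        cache[xid].append(val)
    queries += 1
    docs = [{"id": xid, "name": name, "vals": cache.get(xid, [])}
            for xid, name in outer]
    return docs, queries


conn = build_db()
docs_a, count_a = nested_per_row(conn)
docs_b, count_b = nested_cached(conn)
print(count_a, count_b, docs_a == docs_b)
```

Both strategies build the same documents; the cached variant trades memory for fewer round trips to the database.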
Hi Shalin,
Thanks for the clarifications.
Could you please explain a bit more on how the new searcher can double the
memory ?
Based on your explanation, when a new set of documents gets committed a new
searcher is created. So what I understand is whenever an update/delete query and
search
This is probably severe user error, but I am curious about how to index docs
to make this query work:
happy birthday
to return the doc with n_name:Happy Birthday before the doc with
n_name:Happy Birthday, Happy Birthday . As it is now, the latter appears
first for a query of n_name:happy
On Tue, Nov 25, 2008 at 9:37 PM, souravm [EMAIL PROTECTED] wrote:
Could you please explain a bit more on how the new searcher can double the
memory ?
Take a look at slide 13 of Yonik's presentation available at
http://people.apache.org/~yonik/ApacheConEU2006/Solr.ppt
Each searcher in Solr
On Nov 25, 2008, at 11:40 AM, Brian Whitman wrote:
This is probably severe user error, but I am curious about how to
index docs
to make this query work:
happy birthday
to return the doc with n_name:Happy Birthday before the doc with
n_name:Happy Birthday, Happy Birthday . As it is now, the
Thanks for the responses. Few follow-ups:
1) It seems that the CachedSQLEntityProcessor performs the where clause in
memory on the cache. Is this cache an in memory RDBMS or maps?
2) In the example, there were two use cases, one that is like query=select
* from Y where xid=${X.ID} and another
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394475/solr2_maho-vote.png
https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg
https://issues.apache.org/jira/secure/attachment/12394282/solr2_maho_impression.png
https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg
https://issues.apache.org/jira/secure/attachment/12394266/apache_solr_b_red.jpg
https://issues.apache.org/jira/secure/attachment/12394314/apache_soir_001.jpg
https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg
I wasn't able to find examples/anything via google so thought I'd ask:
Say I want to implement a solution using distributed searches with many
shards in SOLR 1.3.0. Also, say there are too many shards to pass in
via the URL (dozens, hundreds, whatever)
Is there a way to specify in
Hi all,
Struggling with a question I recently got from a colleague: is it possible
to extract keywords from indexed content?
In my opinion it should be possible to find out on what words the
ranking of the indexed content is the highest (Lucene or Solr), but have
no clue where to begin. Anyone
lots of approaches out there...
the easiest off the shelf method would be to use the
MoreLikeThisHandler and get the top interesting terms;
http://wiki.apache.org/solr/MoreLikeThisHandler
ryan
On Nov 25, 2008, at 2:09 PM, Plaatje, Patrick wrote:
Hi all,
Struggling with a question I
On Tue, 2008-11-25 at 20:45 +0530, Shalin Shekhar Mangar wrote:
On Mon, Nov 24, 2008 at 11:51 PM, Timo Sirainen [EMAIL PROTECTED] wrote:
DIH seems to be about Solr pulling data into it from an external source.
That's not really practical with Dovecot since there's no central
repository
Hi,
I am trying to implement a spell check functionality on a
particular field. I need to do a complete phrase spell check when user
enters multiple words.
For example: if the user enters great Hyat, the current implementation would
suggest great Hyatt, just correcting the word hyatt.
Hello guys,
I am getting some stuck threads in my application when it connects to Solr.
The stuck threads occur at regular intervals: after every 3 days the
app is online, it hangs the entire cluster.
I don't know if there's any direct relation to Solr, but I get the following
exception
This sounds like exactly the same issue I had when going from 1.3 to 1.4 ... it
sounds like DIH is trying to automagically figure out the columns :-\
- Jon
On Nov 25, 2008, at 6:37 AM, Joel Karlsson wrote:
Hello,
I get an Unknown field error when I'm indexing an Oracle DB. I've
reduced the
number of
On Mon, 24 Nov 2008 13:31:39 -0500
Burton-West, Tom [EMAIL PROTECTED] wrote:
The approach to this problem used by Nutch looks promising. Has anyone
ported the Nutch CommonGrams filter to Solr?
Construct n-grams for frequently occurring terms and phrases while
indexing. Optimize phrase
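The idea of constructing n-grams for frequent terms at index time can be illustrated with a toy token filter. This is a rough sketch of the general technique, not the Nutch CommonGrams code: the function name, the underscore joiner, and the sample word list are assumptions.

```python
def common_grams(tokens, common):
    """Emit every token, plus a joined bigram wherever a common word is involved.

    A bigram is added after a token when that token or the next one is in
    the `common` set, so phrases with frequent words can be matched through
    the rarer combined term.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            if tok in common or nxt in common:
                out.append(tok + "_" + nxt)
    return out


print(common_grams(["the", "quick", "fox"], {"the"}))
```

A phrase query such as "the quick" could then be answered from the single term the_quick instead of intersecting the very long posting list for "the".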
On Wed, 26 Nov 2008 10:08:03 +1100
Norberto Meijome [EMAIL PROTECTED] wrote:
We didn't notice any severe performance hit but :
- data set isn't huge ( ca 1 MM docs).
- reindexed nightly via DIH from MS-SQL, so we can use a separate cache layer
to lower the number of hits to SOLR.
To make
We are moving from Solr 1.1 to 1.3, and have noticed that 1.3 is working
the garbage collector a lot more. Has anyone else seen this?
wunder
On Tue, Nov 25, 2008 at 7:56 PM, Walter Underwood
[EMAIL PROTECTED] wrote:
We are moving from Solr 1.1 to 1.3, and have noticed that 1.3 is working
the garbage collector a lot more. Has anyone else seen this?
During indexing or searching?
Indexing uses the SolrDocument class as an intermediate
Searching. No facets, but fuzzy matching. --wunder
On 11/25/08 5:08 PM, Yonik Seeley [EMAIL PROTECTED] wrote:
On Tue, Nov 25, 2008 at 7:56 PM, Walter Underwood
[EMAIL PROTECTED] wrote:
We are moving from Solr 1.1 to 1.3, and have noticed that 1.3 is working
the garbage collector a lot more.
Hello,
I am using copyField to send the raw name of an entity into different
fields for indexing:
# schema.xml snippet
field name=raw_name type=string indexed=false stored=true /
field name=indexed_name type=some_custom_type indexed=true
stored=true /
field name=other_indexed_name
On Tue, Nov 25, 2008 at 9:24 PM, Michael Henson
[EMAIL PROTECTED] wrote:
I set the indexed fields to be stored so that I could see what exactly
my custom types' filters produce. The Analyzer utility in the Admin
webapp seems to apply the filters properly. However, query results
against this
On Tue, Nov 25, 2008 at 11:35 PM, Amit Nithian [EMAIL PROTECTED] wrote:
Thanks for the responses. Few follow-ups:
1) It seems that the CachedSQLEntityProcessor performs the where clause in
memory on the cache. Is this cache an in memory RDBMS or maps?
It is a hashmap in memory
2) In the
Anything that is passed as a request parameter can be put into the
SearchHandler's defaults or invariants section.
This is equivalent to passing the shard URL in the request.
However, this means you may need to set up a load balancer if a
shard has more than one host.
On Wed, Nov 26, 2008
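As a rough illustration of moving the shard list out of the URL, the sketch below generates a request handler fragment with the shards parameter in its defaults section. The handler name and the host names are hypothetical, and the output should be checked against your own solrconfig.xml before use.

```python
import xml.etree.ElementTree as ET


def shards_handler(name, shard_urls):
    """Build a requestHandler element whose defaults carry the shards list."""
    handler = ET.Element(
        "requestHandler", {"name": name, "class": "solr.SearchHandler"}
    )
    defaults = ET.SubElement(handler, "lst", {"name": "defaults"})
    shards = ET.SubElement(defaults, "str", {"name": "shards"})
    # The shards parameter is a comma-separated list of shard addresses.
    shards.text = ",".join(shard_urls)
    return ET.tostring(handler, encoding="unicode")


# Hypothetical handler name and shard addresses.
print(shards_handler("/distrib", ["host1:8983/solr", "host2:8983/solr"]))
```

Generating the fragment from a list keeps a long shard inventory in one place instead of in every request URL.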
I am having some trouble using the facet query. As far as I know, the
facet query performs better than a simple query (q).
Here is the example.