As far as I know, Maven is a build/management tool for Java projects, quite similar
to Ant, right? No, I'm not using it, so I think I don't need to worry
about those POM files.
But I'm still not able to figure out the error with classpath/jar files I
mentioned in my previous mails. Shall I try
The Solr distro contains all the jar files. You can take either the
latest release (1.3) or a nightly.
On Tue, Apr 28, 2009 at 11:34 AM, ahmed baseet ahmed.bas...@gmail.com wrote:
As far as I know, Maven is a build/management tool for Java projects, quite similar
to Ant, right? No, I'm not using it,
If you use a CharFilter, you should use a CharStream-aware Tokenizer to
correct term offsets.
There are two CharStreamAware*Tokenizers in trunk/Solr 1.4.
Probably you want to use CharStreamAwareCJKTokenizer(Factory).
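For context, the pairing of a CharFilter with a CharStream-aware tokenizer in the analyzer chain looks roughly like this (a sketch based on the 1.4 trunk example schema; the field type name and mapping file name are illustrative):

```xml
<fieldType name="text_cjk" class="solr.TextField">
  <analyzer>
    <!-- the charFilter runs before the tokenizer and rewrites the CharStream -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-cjk.txt"/>
    <!-- a CharStream-aware tokenizer keeps term offsets correct after the rewrite -->
    <tokenizer class="solr.CharStreamAwareCJKTokenizerFactory"/>
  </analyzer>
</fieldType>
```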
Koji
Ashish P wrote:
After this, should I be using the same CJKAnalyzer or use
Hi Matt,
On Tue, Apr 28, 2009 at 4:24 AM, Matt Mitchell goodie...@gmail.com wrote:
I've been toying with setting custom pre/post delimiters and then removing
them in the client, but I thought I'd ask the list before I go too far with
that idea :)
this is what I do. I define the custom
Hi,
I was trying to extract content from an .xlsx file for indexing.
However, I am getting a Julian date value for a cell with a date format, and '1.0'
in place of '100%'.
I want to retain the values as they appear in the .xlsx file.
Any solution appreciated.
Thanks,
Koushik
Koji san,
Using CharStreamAwareCJKTokenizerFactory is giving me the following error:
SEVERE: java.lang.ClassCastException: java.io.StringReader cannot be cast to
org.apache.solr.analysis.CharStream
Maybe you are typecasting a Reader to the subclass.
Thanks,
Ashish
Koji Sekiguchi-2 wrote:
If you use
Hey there,
I needed multiple date facet functionality - say, for example,
to show the latest results in the last day, last week, and last month. I
wanted to do it with just one query.
The date facet part of solrconfig.xml would look like:
<str name="facet.date">date_field</str>
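One way to get all three ranges in a single query is to use several facet.query parameters instead of a single facet.date setup. A sketch of the request-handler defaults (the field name date_field is taken from the snippet above; not tested against 1.3):

```xml
<lst name="defaults">
  <str name="facet">true</str>
  <!-- one facet.query per range; each count comes back in the same response -->
  <str name="facet.query">date_field:[NOW-1DAY TO NOW]</str>
  <str name="facet.query">date_field:[NOW-7DAYS TO NOW]</str>
  <str name="facet.query">date_field:[NOW-1MONTH TO NOW]</str>
</lst>
```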
The exception is expected if you use CharStream aware Tokenizer without
CharFilters.
Please see example/solr/conf/schema.xml for the setting of CharFilter and
CharStreamAware*Tokenizer:
<!-- charFilter + CharStream aware WhitespaceTokenizer -->
<!--
Is it possible to read only maxAnalyzedChars from the stored field
instead of reading the complete field into memory? For instance, in my
case, is it possible to read only the first 50K characters instead of the
complete 1 MB of stored text? That would help minimize the memory usage
(though, it will still
Koushik,
You didn't say much about how you are doing the extraction. Note that Solr
doesn't do any extraction from spreadsheets itself, even though it has a component
(known as Solr Cell) to provide that interface. The actual extraction is done
by a tool called Tika (or, more precisely, POI); both
On Mon, Apr 27, 2009 at 10:27 PM, Jon Bodner jbod...@blackboard.com wrote:
Trying to point multiple Solrs on multiple boxes at a single shared
directory is almost certainly doomed to failure; the read-only Solrs won't
know when the read/write Solr instance has updated the index.
I'm
On Apr 24, 2009, at 1:54 AM, sagi4 wrote:
Can I get the Rake task for clearing the Solr index? I mean rake
index::rebuild. It would be very helpful, and would also avoid
deleting ids manually.
How do you currently build your index?
But making a Rake task to perform Solr operations
Yes... at least I think so. The highlighting works correctly for me on
another request handler... see below for the request handler of my
MoreLikeThisHandler query.
Thanks for your help... Eric
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
<lst name="defaults">
<str name="fl">
Hi Christian,
I decided to do something very similar. How do you handle cases where the
highlighting is inside of html/xml tags though? I'm getting stuff like this:
?q=jackson
<entry type="song" author="Michael <em>Jackson</em>">Bad by Michael
<em>Jackson</em></entry>
I wrote a regular expression to take care
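The custom-delimiter idea mentioned earlier in the thread can be handled in the client without a fragile regex over nested tags. A sketch in Python (the function name and the @@HL@@ / @@/HL@@ delimiter strings are my own; it assumes Solr was queried with hl.simple.pre/hl.simple.post set to those delimiters):

```python
def render_highlights(snippet):
    """Convert custom highlight delimiters to <em> tags, but drop any
    delimiter that falls inside another tag's angle brackets (e.g. inside
    an attribute value), so the output markup stays well-formed."""
    out = []
    inside_tag = False
    i = 0
    while i < len(snippet):
        if snippet.startswith('@@HL@@', i):
            out.append('' if inside_tag else '<em>')
            i += len('@@HL@@')
        elif snippet.startswith('@@/HL@@', i):
            out.append('' if inside_tag else '</em>')
            i += len('@@/HL@@')
        else:
            c = snippet[i]
            if c == '<':
                inside_tag = True
            elif c == '>':
                inside_tag = False
            out.append(c)
            i += 1
    return ''.join(out)
```

A highlight that lands inside an attribute is simply dropped, while one in text content becomes a normal `<em>` pair.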
How are you indexing it? A sample of the CSV file would be helpful.
Note that while the CSV update handler is very convenient and very
fast, it also doesn't have much in the way of data massaging/
transformation - so it might require you to pre-format the data for Solr
ingestion, or have a
Hi,
You should probably just look at the index version number to figure out if the
name changed. If you are looking at segments.gen, you are looking at a file
that may not exist in Lucene in the future. Use IndexReader API instead.
By refreshes, do you mean reopening a new Searcher? Does
Amit,
You might want to take a look at LuSql[1] and see if it may be
appropriate for the issues you have.
thanks,
Glen
[1]http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql
2009/4/27 Amit Nithian anith...@gmail.com:
All,
I have a few questions regarding the data import
Thank you very much. It's working fine now; those minor classpath
issues are fixed.
Thanks,
Ahmed.
2009/4/28 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com
The Solr distro contains all the jar files. You can take either the
latest release (1.3) or a nightly.
On Tue, Apr 28, 2009 at 11:34 AM,
On Apr 28, 2009, at 9:49 AM, ahammad wrote:
Is it possible for Solr to assign a unique number to every document?
Solr has a UUIDField that can be used for this. But...
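A schema.xml sketch of that setup (field names are illustrative; with default="NEW" Solr generates a fresh UUID for documents that don't supply one):

```xml
<fieldType name="uuid" class="solr.UUIDField" indexed="true"/>
<field name="uid" type="uuid" indexed="true" stored="true" default="NEW"/>
```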
For example, let's say that I am indexing from several databases with
different data structures. The first one has a
To add to that :
This issue was caused by the commit script called internally by
snapinstaller. The commit script creates the Solr URL to do the commit as shown
below:
curl_url=http://${solr_hostname}:${solr_port}/${webapp_name}/update
commitscript logs:
2009/04/28 18:48:21
Just an FYI: I've never tried it, but there seems to be an RSS feed sample in DIH:
http://wiki.apache.org/solr/DataImportHandler#head-e68aa93c9ca7b8d261cede2bf1d6110ab1725476
Koji
Tom H wrote:
Hi,
I've just downloaded solr and got it working, it seems pretty cool.
I have a project which needs to
Thanh Doan wrote:
Assuming a solr search returns 10 listing items as below
1) 4 digital cameras
2) 4 LCD televisions
3) 2 clothing items
If we navigate to /electronics we want solr to show
us facets specific to 8 electronics items (e.g brand, price).
If we navigate to
I see you are using the firstSearcher/newSearcher event listeners on your
startup, and they cause the problem.
If you don't need them, comment them out in solrconfig.xml.
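In solrconfig.xml that would look roughly like this (a sketch; the listener bodies, elided here, are whatever warming queries your config currently contains):

```xml
<!-- warming listeners disabled
<listener event="newSearcher" class="solr.QuerySenderListener">
  ...
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  ...
</listener>
-->
```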
Koji
Eric Sabourin wrote:
I’m using SOLR 1.3.0 (from download, not a nightly build)
apache-tomcat-5.5.27 on Windows XP.
When
Wow, this looks great. Thanks for this Koji!
Matt
On Tue, Apr 28, 2009 at 12:13 PM, Koji Sekiguchi k...@r.email.ne.jp wrote:
Thanh Doan wrote:
Assuming a solr search returns 10 listing items as below
1) 4 digital cameras
2) 4 LCD televisions
3) 2 clothing items
If we navigate to
I think this is a bug.
I looked at the class SnapShooter, and its constructor looks like this:
public SnapShooter(SolrCore core) {
solrCore = core;
}
This leaves the variable snapDir null; it is never
initialized elsewhere, and later in the function
I am using MS SQL Server and want to index a table.
I set up my data-config like this:
<dataConfig>
<dataSource type="JdbcDataSource" batchSize="25000"
autoCommit="true"
driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
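For reference, a complete data-config of this shape would look roughly like the sketch below (the JDBC URL, credentials, table, and field names are all placeholders, not taken from the original post):

```xml
<dataConfig>
  <dataSource type="JdbcDataSource" batchSize="25000"
              autoCommit="true"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost;databaseName=mydb"
              user="solr" password="secret"/>
  <document>
    <!-- one entity per table/query; columns map to schema.xml fields -->
    <entity name="item" query="SELECT id, name FROM items">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>
```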
On Tue, Apr 28, 2009 at 3:18 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:
Hi,
You should probably just look at the index version number to figure out if
the name changed. If you are looking at segments.gen, you are looking at a
file that may not exist in Lucene in the future.
Did you define all the fields that you used in schema.xml?
Ci-man wrote:
I am using MS SQL server and want to index a table.
I setup my data-config like this:
<dataConfig>
<dataSource type="JdbcDataSource" batchSize="25000"
autoCommit="true"
Hi,
I have been trying to solve a performance issue: I have an index of hotels with
their ids, and another index of reviews. Now, when someone queries for a
location, the current process gets all the hotels for that location.
Then, for each hotel id from all the hotel documents,
I do remember LuSQL and a discussion regarding the performance implications
of using it compared to the DIH. My only reason to stick with DIH is that we
may have other data sources for document loading in the near term that may
make LuSQL too specific for our needs.
Regarding the bug to write to
I began a similar thread under the subject Distinct terms in facet field.
One thing I noticed though is that your fields seem to have a lot of controlled
values, or lack free text. Are you sure SOLR is what you should be using?
Perhaps a traditional RDB would be better and then you would have
Have you considered indexing the reviews along with the hotels right
in the hotel index? That way you would fetch the reviews right along with
the hotels...
Really, this is another way of saying flatten your data <G>...
Your idea of holding all the hotel reviews in memory is also viable,
depending
After posting this question I found this discussion
http://www.nabble.com/Hierarchical-Facets--to7135353.html.
So what I did was adapt the schema with 3 fields (cat,
subcat, subsubcat) and hardcode the hierarchical logic in the UI layer
to present a hierarchical taxonomy to the users.
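A sketch of that adapted schema (field and category names are illustrative):

```xml
<field name="cat" type="string" indexed="true" stored="true"/>
<field name="subcat" type="string" indexed="true" stored="true"/>
<field name="subsubcat" type="string" indexed="true" stored="true"/>
```

The UI then drills down by faceting one level at a time, e.g. facet.field=cat at the top, and fq=cat:electronics&facet.field=subcat once a category is chosen.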
The
That didn't work either.
All my libraries are at /Applications/tomcat/webapps/solr/WEB-INF/lib
So is apache-solr-dataimporthandler-1.3.0.jar
However I did create a new /lib directory under my solr home at
/Applications/solr and copied the jar to that location as well.
But no difference.
Here
Hi Tim,
Thanks for your reply. The index structure in my original post is just an
example. We do have many free text fields with different analyzers.
I checked your post Distinct terms in facet field, but I think the issues
we try to address are different. Yours is to get distinct terms in the
: In trying to understand the various options for
: WordDelimiterFilterFactory, I tried setting all options to 0. This seems
: to prevent a number of words from being output at all. In particular
: can't and 99dxl don't get output, nor do any words containing hyphens.
: Is this correct
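That behavior is expected: with all the generate/catenate options set to 0, no rule tells the filter to emit any sub-tokens for words it splits, so mixed tokens like "can't" or "99dxl" produce nothing. A typical index-time setting that does emit the parts (a sketch; option values vary by use case):

```xml
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1"
        generateNumberParts="1"
        catenateWords="0"
        catenateNumbers="0"
        catenateAll="0"/>
```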
: The exception is expected if you use CharStream aware Tokenizer without
: CharFilters.
Koji: i thought all of the casts had been eliminated and replaced with
a call to CharReader.get(Reader) ?
: Please see example/solr/conf/schema.xml for the setting of CharFilter and
:
: Anyone able to help with the question below?
dealing with fl is a delicate dance in Solr right now .. complicated by
both FieldSelector logic and distributed search (where both DocList and
SolrDocumentList objects need to be dealt with).
I looked at this recently and even I can't remember
Chris Hostetter wrote:
: The exception is expected if you use CharStream aware Tokenizer without
: CharFilters.
Koji: i thought all of the casts had been eliminated and replaced with
a call to CharReader.get(Reader) ?
Yeah, right. After r758137, ClassCastException should be eliminated.
Hi,
I'm attempting to serialize a simple ruby object into a solr.StrField - but
it seems that what I'm getting back is munged up a bit, in that I can't
de-serialize it. Is there a field type for doing this type of thing?
Thanks,
Matt
Ankush,
It seems that, unless reviews are changing constantly, you could do what Erick
was saying and flatten your data by storing reviews with the hotel index:
re-index your hotels, storing the top two reviews. I guess I am
suggesting computing the top two reviews for each hotel offline and
Thank you Erik..
Should I write the code below in a Rake task, /lib/tasks/solr.rake?
I am a newbie to Ruby.
Erik Hatcher wrote:
On Apr 24, 2009, at 1:54 AM, sagi4 wrote:
Can I get the Rake task for clearing the Solr index? I mean rake
index::rebuild. It would be very helpful and also to
Ankush,
Your approach works. Fire an 'in' query on the review index for all the hotel ids
you care about. Create a map of each hotel to its reviews.
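A sketch of such an 'in' query as a request URL (the host, core name, and hotel_id field are hypothetical):

```
http://localhost:8983/solr/reviews/select?q=hotel_id:(101 OR 102 OR 103)&rows=1000
```

The client then groups the returned review documents by their hotel_id value to build the map.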
Cheers
Avlesh
On Wed, Apr 29, 2009 at 8:09 AM, Amit Nithian anith...@gmail.com wrote:
Ankush,
It seems that unless reviews are changing constantly, why not
Writing to a remote Solr through SolrJ is in the cards. I may even
take it up after the 1.4 release. For now, your best bet is to extend the
class SolrWriter and override the corresponding methods for
add/delete.
On Wed, Apr 29, 2009 at 2:06 AM, Amit Nithian anith...@gmail.com wrote:
I do remember
I need a function (through solr-ruby) that will allow us to
clear everything.
regards,
Sg..
Geetha wrote:
Thank you Erik..
Should I write the code below in a Rake task, /lib/tasks/solr.rake?
I am a newbie to Ruby.
Erik Hatcher wrote:
On Apr 24, 2009, at 1:54 AM, sagi4 wrote:
Can I
Is the serialized data a UTF-8 string?
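That is usually the culprit: most serializers emit raw bytes, not UTF-8 text, so the value gets mangled in a string field. One common workaround is to base64-encode the serialized bytes before storing them. A language-neutral sketch in Python (the thread is about Ruby's Marshal, but the same encode/decode step applies there):

```python
import base64
import pickle

obj = {"title": "test", "count": 3}

# pickle.dumps returns raw bytes; base64 turns them into safe ASCII text
# that survives a round trip through a string field like solr.StrField.
stored = base64.b64encode(pickle.dumps(obj)).decode("ascii")

# on the way out: decode the base64 text, then deserialize
restored = pickle.loads(base64.b64decode(stored))
assert restored == obj
```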
On Wed, Apr 29, 2009 at 6:42 AM, Matt Mitchell goodie...@gmail.com wrote:
Hi,
I'm attempting to serialize a simple ruby object into a solr.StrField - but
it seems that what I'm getting back is munged up a bit, in that I can't
de-serialize it. Is there