On 4/5/2010 8:43 PM, Mark Miller wrote:
On 04/05/2010 10:12 PM, Chris Hostetter wrote:
: The best you have to work with at the moment is XIncludes:
:
: http://wiki.apache.org/solr/SolrConfigXml#XInclude
:
: and System Property Substitution:
:
:
On 4/5/2010 8:12 PM, Chris Hostetter wrote:
what you can do however, is have a distinct solrconfig.xml for each core,
which is just a thin shell that uses XInclude to include big chunks of
frequently reused declarations, and some cores can exclude some of these
includes. (ie: turn the problem
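A minimal sketch of such a thin-shell solrconfig.xml (the file names under
common/ are hypothetical, not from the thread):

  <?xml version="1.0" encoding="UTF-8"?>
  <config xmlns:xi="http://www.w3.org/2001/XInclude">
    <!-- chunks shared by every core -->
    <xi:include href="common/indexDefaults.xml" />
    <xi:include href="common/handlers.xml" xpointer="//requestHandler" />
    <!-- a core that needs different behavior drops one of the includes
         above and declares its own version here instead -->
  </config>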
On 4/8/2010 2:11 AM, Mark N wrote:
Is it possible to use the Solr DataImportHandler when the database fields are
not fixed? As per my findings, we need to configure which table (entity)
we will read the data from, and must specify which fields in the database
will map to fields in the Solr schema.
Since in my
On 4/8/2010 7:05 AM, Shawn Heisey wrote:
Here's what I'm using as the query in my latest config:
Actually, that was three separate queries:
query=SELECT * FROM ${dataimporter.request.dataTable} WHERE did >
${dataimporter.request.minDid} AND did <=
${dataimporter.request.maxDid
On 4/7/2010 9:26 PM, bbarani wrote:
Hi,
I am currently using DIH to index the data from a database. I am just trying
to figure out if there are any other open source tools which I can use just
for indexing purposes and use SOLR for querying.
I also thought of writing a custom code for
On 4/8/2010 1:15 PM, Chris Hostetter wrote:
...i suspect you want something like...
<xi:include href="handlers.xml" xpointer="//requestHandler" />
where handlers.xml looks like...
<anyThingYouWant>
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler" />
<requestHandler name="/update/javabin"
I've been trying to work out how SOLR thinks about dates internally so I
can boost newer documents. My post_date field is stored as seconds
since the epoch, so I think the following is probably what I want. I
used 3.17 instead of the 3.16 in all the examples because my own math
suggests
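(For reference, the constant is just the reciprocal of a year in
milliseconds: a 365-day year is 3.1536e10 ms, and 1/3.1536e10 ≈ 3.17e-11,
while a 366-day year gives 3.16224e10 ms and 1/3.16224e10 ≈ 3.16e-11, the
value in the examples.)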
On 4/9/2010 7:35 PM, Lance Norskog wrote:
The example function seems to round time to years, so you're boosting by year?
Your dates are stored as UTC 64-bit longs counting the number of
milliseconds since Jan 1, 1970. That's it. They're in milliseconds
whether you supplied them that way or not.
I am using a setup where I have specified the shards parameter in a
broker called main, which then queries a bunch of other machines
including the one it's on, using the core named live.
<requestHandler name="standard" class="solr.SearchHandler" default="true">
<lst name="defaults">
<bool
Adding it to the main core looks like it works, without the dismax
handler even present in the live core config. It won't take the bf
value that I described, though.
<str name="bf">recip(ms(NOW,product(post_date,1000)),3.17e-11,1,1)</str>
This spits an error:
Problem accessing /solr/main/select.
I've got a very simple perl script (most of the work is done with
modules) that I wrote which forks off multiple processes and throws
requests at Solr, then gives a little bit of statistical analysis at the
end. I have planned on sharing it from the beginning, I just have to
clean it up for
On 4/12/2010 8:51 AM, Paolo Castagna wrote:
There are already two related pages:
- http://wiki.apache.org/solr/SolrPerformanceFactors
- http://wiki.apache.org/solr/SolrPerformanceData
Why not create a new page?
- http://wiki.apache.org/solr/BenchmarkingSolr (?)
Done. I hope you like
I am trying to boost relevancy based on a date field with dismax, and
I've included the requestHandler config below. The post_date field in
my database is simple UNIX time, seconds since epoch. It's in a MySQL
bigint field, so I've stored it as a tlong in Solr. This field is
required by our
On 4/12/2010 11:55 AM, Shawn Heisey wrote:
[NOW-6MONTHS TO NOW]^5.0 ,
[NOW-1YEARS TO NOW-6MONTHS]^3.0
[NOW-2YEARS TO NOW-1YEARS]^2.0
[* TO NOW-2YEARS]^1.0
And here we have the perfect example of something I mentioned a while
ago - my Thunderbird (v3.0.4 on Win7) turning Solr boost syntax
On 4/14/2010 8:12 AM, Shawn Heisey wrote:
On 4/12/2010 9:29 PM, Lance Norskog wrote:
During indexing: the basic Solr XmlUpdateHandler does not have a
facility for this. In the DataImportHandler you can add Javascript
that takes your 'seconds since epoch', adds the delta between your
epoch and 1
You
should not have to do any arithmetic or formatting of date strings.
This may need a few layers of SQL functions.
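In MySQL those layers might look something like this (post_date as seconds
since the epoch, per the rest of the thread; the format string and a UTC
server timezone are my assumptions):

  SELECT CONCAT(DATE_FORMAT(FROM_UNIXTIME(post_date),
      '%Y-%m-%dT%H:%i:%s'), 'Z') AS pdate
  FROM ncdat;

That produces the ISO-8601 form Solr's date fields expect, e.g.
2010-04-14T15:30:00Z.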
Is it possible to turn off request logging for some handlers?
Specifically, I'd like to stop logging requests to /admin/ping and
/replication, which get hit very often.
I looked around for an answer but wasn't able to find anything.
Thanks,
Shawn
On 4/15/2010 9:54 AM, Michael Kuhlmann wrote:
you can set logging for nearly every single task here:
http://host:port/solr/admin/logging.jsp
I'm pretty sure that refers to the output that normally goes to stderr;
I'm talking about the logs that go to files like
2010_04_15.request.log.
On 4/19/2010 11:09 AM, Lee Smith wrote:
http://localhost:8983/solr/core1/select?shards=localhost:8983/solr/core2&q=attr_content:test
Is this the correct way to query 2 cores at once ?
This should do what you want:
So, if I have my database multiply my value by 1000, I can put that
directly into a tdate field and it'll work as expected?
If that's the case, I think I might be able to modify my query from
SELECT * to SELECT *,post_date*1000 as pdate and add the pdate field
to the schema as type tdate.
I found what I believe is a better option even if the multiplication
would work - FROM_UNIXTIME. That returns the same kind of output as you
get from an actual database date field.
Michael,
The SolrEntityProcessor looks very intriguing, but it won't work with
the released 1.4 version. If that's OK with you and it looks like it'll
do what you want, feel free to ignore the rest of this.
I'm also using MySQL as an import source for Solr. I was unable to use
the
On 4/20/2010 9:09 PM, caman wrote:
Shawn,
Is this your custom implementation?
For a delta-import, minDid comes from
the maxDid value stored after the last successful import.
Are you updating the dataTable after the import succeeds? How did you
handle this? I have a similar scenario and
Is it possible to issue some kind of query to a Solr core that will
return the last time the index was optimized? Every day, one of my
shards should get optimized, so I would like my monitoring system to
tell me when the newest optimize date is more than 24 hours ago. I
could not find a way
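One possibility (hedged; based on the 1.4 LukeRequestHandler, not on
anything confirmed in this thread): the Luke handler reports index
metadata, including whether the index is optimized and when it was last
modified (host and core name are placeholders):

  http://server:8983/solr/corename/admin/luke?numTerms=0

The response contains entries like <bool name="optimized"> and
<date name="lastModified">, which a monitoring system could poll and
compare against the current time.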
Here's how I've got things set up. It's a different directory structure
than yours, and I run it under jetty, but hopefully it gives you the
basic idea. The dataDir setting is relative to the instanceDir
setting. I run jetty with -Dsolr.solr.home=/index/solr so it can find
solr.xml.
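To make that concrete, a solr.xml along these lines (paths and the dataDir
attribute usage are a sketch, not my actual file; core names borrowed from
later in this digest):

  <solr persistent="true">
    <cores adminPath="/admin/cores">
      <!-- dataDir is resolved relative to instanceDir -->
      <core name="live" instanceDir="cores/live" dataDir="data" />
      <core name="build" instanceDir="cores/build" dataDir="data" />
    </cores>
  </solr>

with jetty started as: java -Dsolr.solr.home=/index/solr -jar start.jar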
I would like to know the same thing. I'm using 5.1.12 myself. A full
reindex of one of my shards takes 4-6 hours for 7 million rows,
depending on whether I run them one at a time or all at once. If I run
the same query on the same machine with the commandline client and write
the results to
Lucas.. was there a reason you went with 5.1.10 or was it just the latest
when you started your Solr project?
Just what was recent when I set things up.
Also, how many items are in your index and how big is your index size?
index size is 4.6GB with about 16M entities.
I
I am looking at SOLR-788, trying to apply it to latest trunk. It looks
like that's going to require some rework, because the included constant
PURPOSE_GET_MLT_RESULTS conflicts with something added later,
PURPOSE_GET_TERMS.
How hard would it be to rework this to apply correctly to trunk? Is
On 5/17/2010 2:40 PM, D C wrote:
We have a large index, separated
into multiple shards, that consists of records exported from a database. One
requirement is to support near real-time
synchronization with the database. To accomplish this we are considering
creating
a daily shard where create
On 5/14/2010 12:40 PM, Shawn Heisey wrote:
I downgraded to 5.0.8 for testing. Initially, I thought it was going
to be faster, but it slows down as it gets further into the index. It
now looks like it's probably going to take the same amount of time.
On the server timeout thing - that's
I am trying to do some denormalizing with DIH from a MySQL source.
Here's part of my data-config.xml:
<entity name="dataTable" pk="did"
query="SELECT *,FROM_UNIXTIME(post_date) as pd FROM ncdat WHERE
did > ${dataimporter.request.minDid} AND did <=
${dataimporter.request.maxDid} AND (did
On 6/28/2010 3:28 PM, caman wrote:
In your query 'query=SELECT webtable as wt FROM ncdat_wt WHERE
featurecode='${ncdat.feature}' .. instead of ${ncdat.feature} use
${dataTable.feature} where dataTable is your parent entity name.
I knew it would be something stupid like that. I thought I
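For the archives, the corrected nesting looks roughly like this (queries
abbreviated from the ones above):

  <entity name="dataTable" pk="did"
          query="SELECT *,FROM_UNIXTIME(post_date) as pd FROM ncdat WHERE ...">
    <entity name="wt"
            query="SELECT webtable AS wt FROM ncdat_wt
                   WHERE featurecode='${dataTable.feature}'" />
  </entity>

The placeholder has to name the parent entity (dataTable), not the parent
table.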
Is it possible for Solr (or Luke/Lucene) to tell me exactly how much of
the total index disk space is used by each field? It would also be very
nice to know, for each field, how much is used by the index and how much
is used for stored data.
Replication does not transfer files that already exist on the slave and
have the same metadata (size, last modified, etc) as the master. As far
as deleting files, it will only do so if they do not exist on the master.
In most cases, the only way that it would delete and copy the entire
index
It's possible to get near real-time adds and updates (every two
minutes in our case) with a multi-shard setup, if you have a shard
dedicated to new content and have the right combination of unique
identifiers on your data. I'll respond off-list with a full description
of my setup.
On
I've started a couple of previous threads on this topic, but I did not
have a good date field in my index to use at the time. I now have a
schema with the document's post_date in tdate format, so I would like to
actually do some implementation. Right now, we are not doing relevancy
ranking
One of the replies I got on a previous thread mentioned range queries,
with this example:
[NOW-6MONTHS TO NOW]^5.0 ,
[NOW-1YEARS TO NOW-6MONTHS]^3.0
[NOW-2YEARS TO NOW-1YEARS]^2.0
[* TO NOW-2YEARS]^1.0
Something like this seems more flexible, and I read into it an
implication that the
I have finally figured out how to turn this off in Thunderbird 3:
Go to Tools, Options, Display, and turn off Display emoticons as
graphics.
On 7/17/2010 3:28 AM, marship wrote:
Hi, Peter and all.
I merged my indexes today. Now each index stores 10M documents. Now I only have
10 solr cores.
And I used
java -Xmx1g -jar -server start.jar
to start the jetty server.
How big are the indexes on each of those cores? You can easily get
I'm developing a new schema that includes something similar. The DIH
database select statement uses a left join to gather a set of values for
each main record into a new field, separated by semicolons. I put the
result into a fieldType with the following analyzer chain, which breaks
it up
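The analyzer chain itself is cut off here; a minimal sketch of one way to
split such a field (PatternTokenizerFactory and TrimFilterFactory are real,
the fieldType name and exact pattern are guesses at the intent):

  <fieldType name="semicolonList" class="solr.TextField">
    <analyzer>
      <!-- one token per semicolon-separated value, whitespace trimmed -->
      <tokenizer class="solr.PatternTokenizerFactory" pattern=";" />
      <filter class="solr.TrimFilterFactory" />
    </analyzer>
  </fieldType>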
On 7/29/2010 12:18 PM, Chris Hostetter wrote:
it also depends on what you want to get *out* if this is a stored field
... using an analyzer like this will deal with letting you facet on the
individual terms, but the stored value returned with each document will
still be a single semi-colon
On 7/29/2010 1:13 PM, Chris Hostetter wrote:
: My initial approach was to grab the values (which are in another table) with a
: DIH subentity and store them in a multivalued field, but that reduced index
: speed to a crawl. That's because instead of one query for the entire import,
: it was
I tried some time ago to use SOLR-788. Ultimately I was able to get
both patch versions to apply (separately), but neither worked. The
suggestion I received when I commented on the issue was to download the
specific release mentioned in the patch and then update, but the patch
was created
On 8/11/2010 3:27 PM, JohnRodey wrote:
1) Is there any information on preferred maximum sizes for a single solr
index. I've read some people say 10 million, some say 80 million, etc...
Is there any official recommendation or has anyone experimented with large
datasets into the tens of
On 8/12/2010 8:32 PM, harrysmith wrote:
Win XP, Solr 1.4.1 out of the box install, using jetty. If I add greater than
or less than (ie < or >) in any xml field and attempt to load or run from
the DataImportConsole I receive a SAXParseException. Example follows:
If I don't have a 'less than' it
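The fix is plain XML escaping: inside data-config.xml attribute values,
the comparison operators have to be written as entities (table and values
here are made up):

  query="SELECT * FROM mytable WHERE id &gt; 100 AND id &lt;= 200"

It is the raw < that breaks the SAX parser; > is actually legal unescaped,
but escaping both keeps things symmetric.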
On 4/9/2010 7:35 PM, Lance Norskog wrote:
Function queries are notoriously slow. Another way to boost by year is
with range queries:
[NOW-6MONTHS TO NOW]^5.0 ,
[NOW-1YEARS TO NOW-6MONTHS]^3.0
[NOW-2YEARS TO NOW-1YEARS]^2.0
[* TO NOW-2YEARS]^1.0
Notice that you get to have a non-linear curve
I have had a request from our development team. I did some searching
and could not find an answer.
They want to sort by a date field but filter out all results below a
minimum relevancy score. Is this possible? I suspect that our only
option will be to do the search sorted by relevancy
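One possibility, sketched from memory and not verified against 1.4: use
the frange parser over the query() function to turn the score into a
filter, then sort by the date field:

  q=foo&sort=post_date desc&fq={!frange l=0.5}query($q)

where l is the minimum acceptable score. The catch is that raw Lucene
scores are not normalized, so a fixed cutoff is hard to choose sensibly.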
Would I do separate bq values for each of the ranges, or is there a
way to include them all at once? If it's the latter, I'll need a full
example with a field name, because I'm clueless. :)
On 8/17/2010 2:29 PM, Lance Norskog wrote:
I think 'bq=' is what you want. In dismax the main query
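For the record, with the post_date field discussed elsewhere in this
digest, separate bq parameters would look something like this (the field
name is my assumption):

  bq=post_date:[NOW-6MONTHS TO NOW]^5.0
  bq=post_date:[NOW-1YEARS TO NOW-6MONTHS]^3.0
  bq=post_date:[NOW-2YEARS TO NOW-1YEARS]^2.0
  bq=post_date:[* TO NOW-2YEARS]^1.0

one bq per range, all on the same request.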
Most of your time is spent doing the query itself, which in the light
of other information provided, does not surprise me. With 12GB of RAM
and 9GB dedicated to the java heap, the available RAM for disk caching
is pretty low, especially if Solr is actually using all 9GB.
Since your index is
I am just delving into the SpellCheckComponent on a test server
running a 3.1 build from June 29th. I have noticed that when you ask
for a rebuild of the spell check index, it deletes it before starting
the rebuild. It takes about 39 minutes to build one (3GB), which is a
long time to do
On 8/20/2010 8:56 PM, Lance Norskog wrote:
The first question is about your use cases. How many words are in the
eventual 3GB spelling index? Do you really need that many?
Spell-checking is a more controllable UI if you make it from a
dictionary.
It's built from an index-only field that
I have a field named keywords in my index. The schema browser page
is not able to deal with this, so I have trouble getting statistical
information on this field. When I click on the field, Firefox hangs for
a minute and then gives the unresponsive script warning. I assume
(without
Can I pass my data through WordDelimiterFilterFactory more than once?
It occurs to me that I might get better results if I can do some of the
filters separately and use preserveOriginal on some of them but not others.
Currently I am using the following definition on both indexing and
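The definition itself is cut off above; the two-pass idea would look
something like this (all attributes are real WordDelimiterFilterFactory
options, but which ones belong in which pass is a guess at the intent):

  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="1" generateNumberParts="1"
          preserveOriginal="1" />
  <filter class="solr.WordDelimiterFilterFactory"
          catenateWords="1" catenateNumbers="1" />

The later messages in this thread report how the output of such a dual
pass differs from a single pass.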
On 5/24/2010 6:30 AM, Sascha Szott wrote:
Hi folks,
is it possible to sort by field length without having to (redundantly)
save the length information in a separate index field? At first, I
thought to accomplish this using a function query, but I couldn't find
an appropriate one.
I have
It's metadata for a collection of 45 million documents that is mostly
photos, with some videos and text. The data is imported from a MySQL
database and split among six large shards (each nearly 13GB) and a small
shard with data added in the last week. That works out to between
300,000 and
On 8/28/2010 7:59 PM, Shawn Heisey wrote:
The only drop in term quality that I noticed was that possessive words
(apostrophe-s) no longer have the original preserved. I haven't yet
decided whether that's a problem.
I finally did notice another drop in term quality from the dual pass
Thank you for taking the time to help. The way I've got the word
delimiter index filter set up with only one pass, wolf-biederman will
result in wolf, biederman, wolfbiederman, and wolf-biederman. With two
passes, the last one is not present. One pass changes gremlin's to
gremlin and
On 8/29/2010 2:17 PM, Erick Erickson wrote:
charFilters are applied even before the tokenizer
Try putting this after any instances of, say, WhitespaceTokenizerFactory
in your analyzer definition, and I believe you'll see that this is not
true.
At least looking at this in the analysis page from
I am trying to use PatternReplaceCharFilterFactory (SOLR-1653) to
strip leading and trailing punctuation from terms. It's not working.
This was previously discussed here as part of something I was trying
with WordDelimiterFilterFactory, but I think it needs its own thread now.
I seem to be
HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed mode)
This filter is not mentioned on the
wiki page dealing with analyzers, which is why I did not use it from the
start. When I searched that page for regex, the CharFilter was the only
one that came up.
On 8/31/2010 8:29 AM, Shawn Heisey wrote:
I didn't give any particulars about my setup
On 8/31/2010 8:49 AM, Shawn Heisey wrote:
I believe I may have solved this. After a more careful reading of
SOLR-1653, I noticed that they referred to another filter. I changed
my configuration from solr.PatternReplaceCharFilterFactory to
solr.PatternReplaceFilterFactory and updated
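The working filter, reconstructed (the filter class and its attributes are
real; the exact regex is my assumption):

  <filter class="solr.PatternReplaceFilterFactory"
          pattern="^\p{Punct}+|\p{Punct}+$"
          replacement="" replace="all" />

Being a token filter rather than a char filter, it runs after
tokenization, stripping punctuation from each term instead of from the raw
input stream.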
On 8/26/2010 5:04 PM, Chris Hostetter wrote:
doubtful.
I suspect it has more to do with the amount of data in your keywords
field and the underlying request to the LukeRequestHandler timing out.
have you tried using it with a test index where the keywords
field has only a few words in it?
On 9/2/2010 2:54 AM, Toke Eskildsen wrote:
We've done a fair amount of experimentation in this area (1997-era SSDs
vs. two 15,000 RPM harddisks in RAID 1 vs. two 10,000 RPM harddisks in
RAID 0). The harddisk setups never stood a chance for searching. With
current SSDs being faster than
On 9/2/2010 9:31 AM, Mark wrote:
Thanks for the suggestions. Our slaves have 12G with 10G dedicated to
the JVM.. too much?
Are the rsync snappuller features still available in 1.4.1? I may try
that to see if helps. Configuration of the switches may also be possible.
Also, would you mind
On 9/3/2010 3:39 AM, Toke Eskildsen wrote:
I'll have to extrapolate a lot here (also known as guessing).
You don't mention what kind of harddrives you're using, so let's say
15,000 RPM to err on the high-end side. Compared to the 2 drives @
15,000 RPM in RAID 1 we've experimented with, the
On 9/3/2010 12:37 PM, Jonathan Rochkind wrote:
Is the OS disk cache something you configure, or something the OS just does
automatically based on available free RAM? Or does it depend on the exact OS?
Thinking about the OS disk cache is new to me. Thanks for any tips.
Depends on what you
I find myself in need of the ability to access one field by more than
one name, for application transition purposes. Right now we have a
field (ft_text, by far the largest part of the index) that is indexed
but not stored. This field and three others are copied into an
additional field
On 9/8/2010 4:32 PM, David Yang wrote:
I have a table that I want to index, and the table has no datetime
stamp. However, the table is append only so the primary key can only go
up. Is it possible to store the last primary key, and use some delta
query=select id where id > ${last_id_value}
I
On 9/9/2010 1:23 PM, Vladimir Sutskever wrote:
Shawn,
Can you provide a sample of passing the parameter via URL? And how using it
would look in the data-config.xml
Here's the URL that I send to do a full build on my last shard:
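(The URL itself is truncated out of this snippet; going by the full-import
example near the end of this digest, it was something along the lines of

  http://host:8983/solr/shard/dataimport?command=full-import&numShards=6&modValue=0&minDid=229615984

with host and core name as placeholders, the parameters being read back in
data-config.xml as ${dataimporter.request.numShards} and friends.)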
On 9/9/2010 5:38 PM, Erick Erickson wrote:
Could you give us an idea of why you think it isn't present? As far as I can
tell,
it's been around for a while. Are you getting an error and if so, can you
show it
to us?
Look in schema.xml of what you downloaded (probably in the example
directory).
I use the PingRequestHandler option that tells my load balancer
whether a machine is available.
When the service is disabled, every one of those requests, which my load
balancer makes every five seconds, results in the following in the log:
Sep 9, 2010 6:06:58 PM
On 9/15/2010 10:50 AM, Shashi Kant wrote:
Shawn, I have done some research into this, machine-vision especially
on a large scale is a hard problem, not to be entered into lightly. I
would recommend starting with OpenCV - a comprehensive toolkit for
extracting various features such as Color,
On 9/16/2010 7:45 AM, Shashi Kant wrote:
Lire is a nascent effort and based on a cursory overview a while back,
IMHO was an over-simplified version of what a CBIR engine should be.
They use CEDD (color edge descriptors).
Wouldn't work for the kind of applications I am working on - which
needs
On 9/16/2010 12:27 PM, Dennis Gearon wrote:
Is a core a running piece of software, or just an index/config pairing?
Dennis Gearon
A core is one complete index within a Solr instance.
http://wiki.apache.org/solr/CoreAdmin
My master index servers have five cores - ncmain, ncrss, live, build,
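Each of those shows up in the CoreAdmin interface; the status of every
core on an instance can be fetched with (host and port are placeholders):

  http://host:8983/solr/admin/cores?action=STATUS

which reports each core's name, instanceDir, dataDir, and index
statistics.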
On 9/17/2010 3:01 AM, Paul Dhaliwal wrote:
Another feature missing in DIH is the ability to pass parameters into your
queries. If one could pass a named or positional parameter for an entity
query, it will give them lot of freedom to optimize their delta or full load
queries. One can even get
On 9/17/2010 7:22 PM, Chris Hostetter wrote:
a) not really. assuming you have no problem modifying the indexing code
in the way you want, and are primarily worried about searching from
various clients, then the most straight forward approach is probably to
use RewriteRules (or something
On 9/22/2010 1:39 AM, Shashikant Kore wrote:
Hi,
I'm using DIH to index records from a database. After every update on
(MySQL) DB, Solr DIH is invoked for delta import. In my tests, I have
observed that if db updates and DIH import is happening concurrently, the import
misses a few records.
Here
You could get it from Solr, yes. That didn't even occur to me because
when I was designing my scripts, I didn't yet have a fully integrated
Solr index. :) With hindsight, I still wouldn't get it from Solr. I
would lose some flexibility and ease of administration.
It's certainly possible
Assuming you are on a unix variant with a working lsof, use this. This
probably won't work correctly on Solaris 10:
lsof -nPi | grep 8983
lsof -nPi | grep 8080
On Windows, you can do this in a command prompt. It requires elevation
on Vista or later. The -b option was added in WinXP SP2 and
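The snippet is cut off, but the command meant here is presumably along the
lines of:

  netstat -abn | findstr 8983

-a lists all connections and listening ports, -n keeps the output numeric,
and -b (the XP SP2 addition) shows which executable owns each socket.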
I would like to be able to do a delta import on arbitrary data, not a
last modified date. Specifically, our database has an auto_increment
field called DID, or document identifier. For changes to existing data,
this field is updated anytime a row is changed in any way, effectively
turning it
We are currently using a commercial indexing product based on Lucene for
our indexing needs, and would like to replace it with SOLR. The source
database for this system has 40 million records, growing by about 30,000
items per day. It is a repository for all the metadata relating to an
At the 9+ hour mark, is your database server showing active connections
that are sending data, or is all the activity local to SOLR?
We have a 40 million row database in MySQL, with each row comprising
more than 80 fields. I'm including the config from one of our shards.
There are about 6.6
parameter is larger than physical memory. If this is
happening, you'd definitely see constant hard drive light blinking.
Do keep looking into the batchSize, but I think I might have found the
issue. If I understand things correctly, you will need to add
processor=CachedSqlEntityProcessor to your first entity. It's only
specified on the other two. Assuming you have enough RAM and heap space
available in your
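A minimal sketch of that, using entity names from earlier in this digest
(the where= attribute is the 1.4-era syntax for cache lookups):

  <entity name="dataTable" query="SELECT * FROM ncdat">
    <entity name="wt" processor="CachedSqlEntityProcessor"
            query="SELECT featurecode, webtable AS wt FROM ncdat_wt"
            where="featurecode=dataTable.feature" />
  </entity>

The child query then runs once and is cached, instead of once per parent
row.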
What database are you using? Many of the JDBC drivers try to pull the
entire resultset into RAM before feeding it to the application that
requested the data. If it's MySQL, I can show you how to fix it. The
batchSize parameter below tells it to stream the data rather than buffer
it. With
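For MySQL the usual fix is batchSize="-1" on the dataSource, which DIH
passes to Connector/J as the special fetch size that enables row-by-row
streaming (connection details are placeholders):

  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://dbhost/dbname"
              user="someuser" password="somepass"
              batchSize="-1" />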
I guess I must be including too much information in my questions,
running into tl;dr with them. Later today when I have more time I'll
try to make it more bite-size.
On 3/9/2010 2:28 PM, Shawn Heisey wrote:
I attended the Webinar on March 4th. Many thanks to Yonik for putting
Does SolrCloud's notion of a collection, which appears to use cores,
override normal multi-core usage for building an offline index and
quickly swapping it into production? Some of the features in SolrCloud
look useful, if it's still possible to exert manual control over cores
and shards.
Disclaimer: My Oracle experience is minuscule at best. I am also a
beginner at Solr, so grab yourself the proverbial grain of salt.
I googled a bit on CLOB. One page I found mentioned setting up a view
to return the data type you want. Can you use the functions described
on these pages in
Below is my data-config.xml file, which I am using to build an index for
my first shard. I have a couple of questions.
Can Solr include the hostname (short version) it's running on in the
query? Alternatively, is there a way to override the query with a URL
parameter before or when doing
That looks very useful. So does this mean that this will work?
URL text:
?command=full-import&numShards=6&modValue=0&minDid=229615984
XML:
query=SELECT * FROM [table] WHERE (did %
${dataimporter.request.numShards}) = ${dataimporter.request.modValue}
AND ${dataimporter.request.minDid} <= did