Hi,
The problem comes from PDFBox
(http://brutus.apache.org/jira/browse/PDFBOX-377) and is fixed now.
However Tika doesn't yet use this version of PDFBox.
So for PDF text extraction, I don't use Tika but pdftotext.
Dominique
On 09/03/10 06:00, Robert Muir wrote:
it is an optional
Hi,
Yes, I agree it is not an easy issue. Indexing all languages with the
appropriate char filter, tokenizer and filters for each language is not
possible without developing a new field type and a new analyzer.
If you plan to index up to 10 different languages, I suggest one text
field per
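For example, separate fields with per-language analysis chains can be declared in schema.xml; this is an illustrative sketch (type and field names are not from the original message):

```xml
<!-- One field type per language, each with its own analysis chain -->
<fieldType name="text_en" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
<fieldType name="text_fr" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- strip French elisions such as l' and d' before lowercasing -->
    <filter class="solr.ElisionFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="French"/>
  </analyzer>
</fieldType>

<!-- One text field per language -->
<field name="title_en" type="text_en" indexed="true" stored="true"/>
<field name="title_fr" type="text_fr" indexed="true" stored="true"/>
```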
Hi,
I developed a custom analyzer. This analyzer needs to be polymorphous
according to the first 4 characters of the text to be analyzed. In order
to do this I implemented my own ReuseStrategy class (NoReuseStrategy) and
in the constructor I call super(new NoReuseStrategy());
At Lucene
Hi,
In order to crawl and index your web sites, maybe you can have a look at
www.crawl-anywhere.com. It includes a web crawler, a document processing
pipeline and a solr indexer.
Dominique
On 23/12/10 16:27, Dietrich wrote:
I want to use Solr to index two types of documents:
- local
Hi,
I would not try to change the Lucene version in Solr 1.4.1 from 2.9.x to
3.0.x.
As Koji said, the best solution is to get the 3.x branch or the trunk
and build it. You need svn and ant.
1. Create a working directory
$ mkdir ~/solr
2. Get the source
$ cd ~/solr
$ svn co
Hi,
I would like to announce Crawl Anywhere. Crawl-Anywhere is a Java Web
Crawler. It includes :
* a crawler
* a document processing pipeline
* a solr indexer
The crawler has a web administration interface in order to manage the web
sites to be crawled. Each web site crawl is configured with a
Dominique
On 02/03/11 09:36, Rosa (Anuncios) wrote:
Nice job!
It would be good to be able to extract specific data from a given page
via XPATH though.
Regards,
On 02/03/2011 01:25, Dominique Bejean wrote:
Hi,
I would like to announce Crawl Anywhere. Crawl-Anywhere is a Java Web
Java into search field, it returned a
lot of references to coldfusion error pages. Maybe a recrawl would help?
On Wed, Mar 2, 2011 at 1:25 AM, Dominique Bejean
dominique.bej...@eolya.fr mailto:dominique.bej...@eolya.fr wrote:
Hi,
I would like to announce Crawl Anywhere. Crawl
Aditya,
The crawler is not open source and won't be in the near future. Anyway,
I have to change the license because it can be used for any personal or
commercial project.
Sincerely,
Dominique
On 02/03/11 10:02, findbestopensource wrote:
Hello Dominique Bejean,
Good job.
We
looking for crawlers that incorporate this technology
but without success.
Any plans on incorporating this?
Cheers,
Geert-Jan
2011/3/2 Dominique Bejean dominique.bej...@eolya.fr
mailto:dominique.bej...@eolya.fr
Rosa,
In the pipeline, there is a stage that extracts the text from
and that is posing challenges with Nutch?
-Original Message-
From: Dominique Bejean [mailto:dominique.bej...@eolya.fr]
Sent: Wednesday, March 02, 2011 6:22 AM
To: solr-user@lucene.apache.org
Subject: Re: [ANNOUNCE] Web Crawler
Aditya,
The crawler is not open source and won't
Hi,
In Solr 3.x the parameter abortOnConfigurationError=false allows cores to
continue to work even if another core fails due to a configuration error.
This parameter doesn't exist anymore in Solr 4.0 but after some tests,
it looks like cores are isolated from each other. By isolated, I mean
Hi,
I think the response is yes, but I need to check.
Is it possible to upgrade from Solr 3.4 to Solr 3.6.1 without rebuilding
the existing index?
Thank you.
Dominique
Hi,
I wrote a custom field type that needs to read a configuration file in the
conf directory of the core and also get the absolute path of the conf
directory.
In solr 4 alpha, my code was something like :
import org.apache.solr.core.SolrResourceLoader;
...
public class MultilingualField
();
        if (path.lastIndexOf(f.separatorChar) != -1) {
            return path.substring(0, path.lastIndexOf(f.separatorChar));
        }
        return null;
    }
    return null;
}
Not sure it is the best way, but it works :)
Dominique
On 27/08/12 23:40, Dominique Bejean wrote:
Maybe you can take a look at Crawl-Anywhere, which has an administration
web interface, a Solr indexer and a search web application.
www.crawl-anywhere.com
Regards.
Dominique
On 05/09/12 17:05, Lochschmied, Alexander wrote:
This may be a bit off topic: How do you index an existing website and
Hi,
Are you using a correct stopword file for the French language? It is
very important in order for the MLT component to work fine.
You should also take a look at this document.
http://cephas.net/blog/2008/03/30/how-morelikethis-works-in-lucene/
MLT support in SolrJ is an old story. Maybe
Hi,
I didn't see this question.
Yes, I confirm Crawl-Anywhere can crawl in distributed environment.
If you have several huge web sites to crawl, you can dispatch crawling
across several crawler engines. However, a single web site can only be
crawled by one crawler engine at a time.
This
Hi,
Crawl-Anywhere is now open-source - https://github.com/bejean/crawl-anywhere
Best regards.
On 02/03/11 10:02, findbestopensource wrote:
Hello Dominique Bejean,
Good job.
We identified almost 8 open source web crawlers
http://www.findbestopensource.com/tagged/webcrawler I don't
Hi,
I did see this message (again). Please, use the new dedicated
Crawl-Anywhere forum for your next questions.
https://groups.google.com/forum/#!forum/crawl-anywhere
Did you solve your problem ?
Thank you
Dominique
On 29/01/13 09:28, SivaKarthik wrote:
Hi,
i resolved the issue
Hi,
Crawl-Anywhere includes a customizable document processing pipeline.
Crawl-Anywhere can also cache original crawled pages and documents in a
mongodb database.
Best regards.
Dominique
On 11/02/13 06:16, SivaKarthik wrote:
Dear Erick,
Thanks for your reply.
Yes, Nutch can meet
of the required software?
Is there an updated installation guide available?
Thanks
Rajesh
On Wed, May 22, 2013 at 6:48 PM, Dominique Bejean
dominique.bej...@eolya.fr mailto:dominique.bej...@eolya.fr wrote:
Hi,
Crawl-Anywhere is now open-source -
https://github.com/bejean/crawl-anywhere
With 6 Zookeeper instances you need at least 4 instances running at the same
time. How can you decide to stop 4 instances and have only 2 instances
running? Zookeeper can't work anymore in these conditions.
Dominique
On Jul 25, 2013 at 00:16, Joshi, Shital shital.jo...@gs.com wrote:
We
Hi,
Up to now, the best solution I found in order to implement a multi-word
suggester was to use the ShingleFilterFactory filter at index time and the
TermsComponent. At index time the analyzer was:
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
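A complete index-time shingle chain along those lines might look like this (the filter parameters are illustrative, not from the original message):

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- emit 2- and 3-word shingles alongside the single terms,
       so the TermsComponent can suggest multi-word phrases -->
  <filter class="solr.ShingleFilterFactory" minShingleSize="2"
          maxShingleSize="3" outputUnigrams="true"/>
</analyzer>
```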
send something to the suggester you send just
eco or éco you fold them to eco too and get back these tokens.
Then the app layer breaks them up and displays them pleasingly.
Best
Erick
On Tue, Oct 1, 2013 at 5:45 PM, Dominique Bejean
dominique.bej...@eolya.fr wrote:
Hi,
Up to now, the best
Hi,
I agree with Erick, it would be a good thing to have more details about your
configuration and collection.
Your heap size is 32 GB. How much RAM on each server?
By « 4 shard Solr cluster », do you mean 4 Solr server nodes or a
collection with 4 shards?
So, how many nodes in the cluster?
How
Hi,
I usually put all dependency jar files (DIH, JDBC driver, …) in a lib
directory in the Solr home directory where your shards are created,
something like this:
solr/
solr.xml
cloudcollection1_shard2_replica2/
lib/
In solrconfig.xml, I remove all the lib … directives except this
in solrconfig.xml ?
Filter cache, query result cache and document cache are enabled.
Auto-warming is also done.
Can you provide all other JVM parameters ?
-Xms20g -Xmx24g -XX:+UseConcMarkSweepGC
Thanks again,
Modassar
On Wed, Dec 24, 2014 at 2:29 AM, Dominique Bejean
dominique.bej...@eolya.fr
wrote
And you didn't say how much RAM is on each server?
2014-12-24 8:17 GMT+01:00 Dominique Bejean dominique.bej...@eolya.fr:
Modassar,
How many items in the collection?
I mean how many documents per collection? 1 million, 10 million, …?
How are the caches configured in solrconfig.xml?
What
for core inytapdf0
Philippe
- Original Message -
From: Dominique Bejean dominique.bej...@eolya.fr
To: solr-user@lucene.apache.org
Sent: Thursday, January 15, 2015 11:46:43
Subject: Re: Core deletion
Hi,
Is there something in the Solr logs at startup that can explain the deletion?
How were
Hi,
Is there something in the Solr logs at startup that can explain the deletion?
How were the cores created? Using the Cores API?
Dominique
http://www.eolya.fr
2015-01-14 17:43 GMT+01:00 phi...@free.fr:
Hello,
I am running SOLR 4.10.0 on Tomcat 8.
The solr.xml file in
One of our customers needs to index 15 billion documents in a collection.
As this volume is not usual for me, I need some advice about SolrCloud
sizing (how many servers, nodes, shards and replicas, how much memory, ...)
Some inputs:
- Collection size: 15 billion documents
- Collection update : 8
, Feb 17, 2015 at 4:40 PM, Dominique Bejean
dominique.bej...@eolya.fr wrote:
One of our customers needs to index 15 billion documents in a collection.
As this volume is not usual for me, I need some advice about SolrCloud
sizing (how many servers, nodes, shards and replicas, how much memory
, last week, week before, ...)
Regards
Dominique
2015-02-18 10:35 GMT+01:00 Toke Eskildsen t...@statsbiblioteket.dk:
On Wed, 2015-02-18 at 01:40 +0100, Dominique Bejean wrote:
(I reordered the requirements)
- Collection size : 15 billion documents
- Document size is nearly 300 bytes
- 1
Hi,
As Shawn said, install enough memory so that all free memory (non-heap
memory) can be used as disk cache.
Use at most 40% of the available memory for the heap (Xmx JVM
parameter), but never more than 32 GB.
And avoid letting your server swap.
For most Linux systems, this is configured
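On Linux, swap behaviour is typically tuned with vm.swappiness; an illustrative setting (the value is a common recommendation, not taken from the original message):

```ini
# /etc/sysctl.conf — lower values make the kernel much less eager to swap
vm.swappiness=1
```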
Hi,
When you say you renamed some cores and cleaned other unused ones that you
don't need anymore etc., how did you do this?
With the Cores or Collections API, or by deleting the cores' directories in
the Solr Home?
Dominique
http://www.eolya.fr
2015-02-18 17:04 GMT+01:00 Abdelali AHBIB alifar...@gmail.com:
Hi,
I never used map-reduce indexing.
My understanding is that map-reduce tasks generate one or more Solr
indices, then the golive tool is used in order to merge these indices at
core level to one or more shards (the shard's leaders) in a Solrcloud
collection. After merge occurs in leaders the
Hi,
In release 4.10.3, the following lines were removed from the Solr start
script (bin/solr):
# TODO: see SOLR-3619, need to support server or example
# depending on the version of Solr
if [ -e "$SOLR_TIP/server/start.jar" ]; then
  DEFAULT_SERVER_DIR="$SOLR_TIP/server"
else
Hi,
You can have a look at www.crawl-anywhere.com
A web crawler on top of Solr, used for the following vertical search engines:
http://www.hurisearch.org/
http://www.searchamnesty.org/
Regards
Dominique
2015-01-06 15:22 GMT+01:00 Ahmet Arslan iori...@yahoo.com.invalid:
Hi,
and is still work in progress
but should give you more information.
Hope that helps.
On Tue, Jan 6, 2015 at 1:29 AM, Dominique Bejean
dominique.bej...@eolya.fr javascript:;
wrote:
Hi,
In release 4.10.3, the following lines were removed from solr starting
script (bin/solr)
# TODO
Thank you for the response
This is something Heliosearch can do. Yonik Seeley created a JIRA ticket
to backport this feature to Solr 5.
https://issues.apache.org/jira/browse/SOLR-7214
But in order to be available in Solr 5 this ticket should cover both
http://heliosearch.org/json-facet-api/
Hi,
Here is a query with a sample result set.
http://localhost:8983/solr/myindex/select?q=*%3A*&wt=json&indent=true&stats=true&stats.field={!tag=piv1}size&facet=true&facet.limit=10&facet.pivot={!stats=piv1}object&rows=0
facet_counts:{
facet_queries:{},
facet_fields:{},
facet_dates:{},
Hi,
Is it normal with Solr 4.10.3 that the data directory of replicas still
contains directories like
index.3636365667474747
index.999080980976
and files
index.properties
replica.properties
If yes, why and in which circumstances ?
Regards
Dominique
Hi,
I am trying to adapt Mark Miller's solr-map-reduce-example scripts in order
to use MapReduceIndexerTool with Solr 5.0.0 and Hadoop 2.6.0.
I use the same twitter sample data with the same avro configuration, ...
I had to change the set-map-reduce-classpath.sh file provided with Solr 5
under
commits (after DIH
import) and when nodes restart.
So, I will have more precise log messages tomorrow.
Thank you for your response.
Dominique
2015-04-01 18:29 GMT+02:00 Shawn Heisey apa...@elyograg.org:
On 4/1/2015 6:35 AM, Dominique Bejean wrote:
Is it normal with Solr 4.10.3 that the data
Hi,
The wiki explains how to upload the security.json file to Zk (
https://cwiki.apache.org/confluence/display/solr/Authentication+and+Authorization+Plugins
).
However, is it possible to use the authentication and authorization plugins
in a non-SolrCloud environment? If yes, where does it have to be located
Hi,
Is there a way to list all the current Solr settings? Something similar to
the MySQL « show variables » command?
For instance, if I configure the « transientCacheSize » parameter in the
solr.xml file, how can I be sure this setting was taken into account?
Regards
Dominique
Hi,
Is the SnowballPorterFilter sensitive to accents, for French for
instance?
If I use both SnowballPorterFilter and ASCIIFoldingFilter, do I have to
configure ASCIIFoldingFilter after SnowballPorterFilter?
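For what it's worth, a common ordering folds accents only after stemming, because the French Snowball stemmer relies on accented characters; an illustrative chain (not from the original message):

```xml
<analyzer>
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <filter class="solr.ElisionFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <!-- stem first: the French stemmer is accent-sensitive -->
  <filter class="solr.SnowballPorterFilterFactory" language="French"/>
  <!-- then fold accents so "éco" and "eco" match -->
  <filter class="solr.ASCIIFoldingFilterFactory"/>
</analyzer>
```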
Regards.
Dominique
--
Dominique Béjean
06 08 46 12 43
>
> Ahmet
>
>
>
> On Friday, February 10, 2017 11:27 AM, Dominique Bejean <
> dominique.bej...@eolya.fr> wrote:
> Hi,
>
> Is the SnowballPorterFilter sensitive to the accents for French for
> instance ?
>
> If I use both SnowballPorterFilte
is?
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Tue, Apr 18, 2017 at 6:33 AM, Dominique Bejean <
> dominique.bej...@eolya.fr
> > wrote:
>
> > Hi,
> >
> > I do not understand what I am doing wrong in this
"id": "book2",
"id_book_s": "book2",
"review_dt": "1994-03-15T12:00:00Z"
},
{
"title_s": "Friends",
"pubyear_i": 1994,
"stars_i": 4,
collection).
Dominique
On Tue, Apr 18, 2017 at 15:28, Dominique Bejean <dominique.bej...@eolya.fr>
wrote:
> Hi,
>
> I reply to myself
>
> I just had to invert the "on" clause to make it work
>
> curl --data-urlencode 'expr=innerJoin(
Hi,
I do not understand what I am doing wrong in this simple query.
curl --data-urlencode 'expr=innerJoin(
search(books,
q="*:*",
fl="id",
sort="id asc"),
10,000 then that return
> packet is obviously 1,000 times as large and must be assembled in
> memory.
>
> I rather doubt the phonetic filter is to blame. But you can test this
> by just omitting the field containing the phonetic filter in the
> search query. I've certainly been wrong befor
Thank you Shawn for replying to each item.
I am starting to better figure out all this tricky JVM stuff.
Dominique
On Sun, Dec 3, 2017 at 01:30, Shawn Heisey <apa...@elyograg.org> wrote:
> On 12/2/2017 8:43 AM, Dominique Bejean wrote:
> > I would like to have some advices on best pr
Hi,
I would like to have some advice on best practices related to heap size,
MMap, direct memory, GC algorithm and OS swap.
This is a vast subject and sorry for this long question, but all these
items are linked in order to have a stable Solr environment.
My understanding and questions.
About
:NON^203+size_facet_boost_exact:"velo"^299+size_facet_boost:velo^296+size_facet_relative_boost:velo^292+marque_boost_exact:"velo"^359+marque_boost:velo^356+marque_relative_boost:velo^352+=velo=200=velo=edismax=textSearch=true=1=true=json=EUR_0_price_decimal=sort_EUR_0_special_p
; This has been solid in production with a 32 node Solr Cloud cluster. We do
> not do faceting.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
> > On Dec 2, 2017, at 7:43 AM, Dominique Bejean <dominique.bej...@eol
Hi,
We are encountering an issue with GC.
Randomly, nearly once a day, there are consecutive full GCs with no memory
reclaimed, so the old generation heap usage grows up to the limit.
Solr stops responding and we need to force a restart.
We are using Solr 6.6.1 with Oracle 1.8 JVM. The JVM settings
ssure by roughly the same amount so it's a win in this
> situation.
>
> Have you attached a memory profiler to the running Solr instance? I'd
> be curious where the memory is being allocated.
>
> Best,
> Erick
>
> On Fri, Dec 1, 2017 at 8:31 AM, Toke Eskildsen <t..
Hi,
Which version of Solr are you using ?
Regards
Dominique
On Fri, May 4, 2018 at 09:13, Bernd Fehling
wrote:
> Hi list,
>
> this sounds simple but I can't disable PrintGCTimeStamps in solr_gc
> logging.
> I tried with GC_LOG_OPTS in start scripts and
Hi,
On a node, I accidentally changed the SOLR_HOST value from uppercase to
lowercase and I restarted the node. After I fixed the error, I restarted
again the node but the node name in lowercase is still visible as "gone".
How to definitively remove a gone node from the Solrcloud graph ?
to ZK
# server/scripts/cloud-scripts/zkcli.sh -z "xxx.xxx.xxx.xxx:2181" -cmd
putfile /collections/xx/state.json /tmp/-state-local.json
- Start all Solr nodes
Dominique
On Tue, May 29, 2018 at 14:19, Dominique Bejean
wrote:
> Hi,
>
> On a node, I accide
Hi,
I am trying multi-word query-time synonyms with Solr 6.6.2 and the
SynonymGraphFilterFactory filter as explained in this article
https://lucidworks.com/2017/04/18/multi-word-synonyms-solr-adds-query-time-support/
My field type is :
text_gp:maillot) (((+name_text_gp:olympiqu +name_text_gp:de
> +name_text_gp:marseil) name_text_gp:om)))
>
> (btw my stop list only has “de” on it)
>
> Thanks,
>
> --
> Steve
> www.lucidworks.com
>
> > On Feb 10, 2018, at 2:12 AM, Dominique Bejean <domin
"parsedquery_toString":"+(((name_text_gp:maillot) ((name_text_gp:om
(+name_text_gp:olympiqu +name_text_gp:marseil~1)",
The query result are the same for all queries.
It looks like this could be an acceptable workaround.
Thank you
Dominique
On Sun, Feb 11, 2018 at 10:31, Dominiqu
)
olympiqu om marseil maillot
So, I suspect an issue with the edismax query parser.
Regards.
Dominique
On Fri, Feb 9, 2018 at 18:25, Dominique Bejean <dominique.bej...@eolya.fr>
wrote:
> Hi,
>
> I am trying multi-word query-time synonyms with Solr 6.6.2 and the
> SynonymGraphFi
Hi,
We are also experiencing time-out issues from time to time.
I sent this message one month ago, by mistake, to the dev list.
Why use hardcoded values in the ZkClientClusterStateProvider.java file
while there are existing parameters for these time-outs?
Regards
Dominique
Hi,
Using Grafana with Solr starting with version 7 is very easy and well documented.
https://lucene.apache.org/solr/guide/7_3/monitoring-solr-with-prometheus-and-grafana.html
Dominique
On Mon, Jul 16, 2018 at 06:56, Aroop Ganguly
wrote:
> How do you use Grafana with Solr ? Did you build a http
Hi,
We are experiencing an issue related to a ZK timeout.
Stacktrace is :
ERROR 19 juin 2018 06:24:07,152 - h.concurrent.ConcurrentService:67 -
Erreur dans l'attente de la fin de l'exécution d'un thread
ERROR 19 juin 2018 06:24:07,152 - h.concurrent.ConcurrentService:68 -
”.
Regards
Dominique
On Fri, Mar 9, 2018 at 00:40, Shawn Heisey <apa...@elyograg.org> wrote:
> On 3/8/2018 2:55 PM, Dominique Bejean wrote:
> > Disk I/O are critical for high performance Solrcloud.
>
> This statement has truth to it, but if your system is correctly size
Hi,
Disk I/O is critical for a high-performance SolrCloud.
I am looking for relevant disk I/O tests for both Solr nodes and Zookeeper,
and, given these tests, what bad, correct or good results are.
For instance, how to know if these results with the basic dd utility report
correct disk
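As a rough baseline, a sequential-write test with dd can look like this (the path and sizes are illustrative; conv=fdatasync makes dd include the flush to disk in the reported rate, so the number is not just page-cache speed):

```shell
# Write 64 MiB sequentially and report throughput including the final flush.
# Run it on the same filesystem that holds the Solr (or Zookeeper) data dir.
dd if=/dev/zero of=/tmp/solr_io_test.bin bs=1M count=64 conv=fdatasync
```

Compare the reported MB/s against your disk's expected sequential-write rating; random-read latency (which matters most for searching) needs a different tool such as fio.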
Hi,
In the Solr Admin console you can access, for each core, the "Segments
info" page. You can see if there are more deleted documents in segments on
server X.
Dominique
On Mon, Oct 8, 2018 at 07:29, SOLR4189 wrote:
> About which details do you ask? Yesterday we restarted all our solr
>
Hi,
1/
As previously said by other persons, my first action would be to understand
why you need so much heap.
The first step is to cap your heap size at 31 GB (or obviously less if
possible).
https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/
Can you
Hi,
What about the cores' segment details in the admin UI? More deleted
documents?
Regards
Dominique
On Sun, Oct 7, 2018 at 08:22, SOLR4189 wrote:
> Hi all,
>
> We use SOLR-6.5.1 and we have very strange issue. In our collection index
> size is very different from server to server
in Solr standalone mode, only authentication is fully functional, not
authorization!
Regards.
Dominique
On Sun, Dec 30, 2018 at 13:40, Dominique Bejean
wrote:
> Hi,
>
> After reading more carefully the log file, here is my understanding.
>
> The request
>
> http://2:
?
Regards
Dominique
On Fri, Dec 21, 2018 at 10:46, Dominique Bejean
wrote:
> Hi,
>
> I am trying to configure security.json file, in order to define the
> following users and permissions :
>
>- user "admin" with all permissions on all collections
>- u
Hi,
I created a Jira issue
https://issues.apache.org/jira/browse/SOLR-13097
Regards.
Dominique
On Mon, Dec 31, 2018 at 11:26, Dominique Bejean
wrote:
> Hi,
>
> In debugging mode, I discovered that only in SolrCloud mode the collection
> name is extract from the request path
Hi,
I am trying to configure the security.json file, in order to define the
following users and permissions:
- user "admin" with all permissions on all collections
- user "read" with read permissions on all collections
- user "1" with only read permissions on biblio collection
-
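A security.json skeleton for this kind of setup might look like the following (the plugin classes are the standard ones; the credential hashes are placeholders, and the role and permission layout is illustrative):

```json
{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "credentials": {
      "admin": "<base64(sha256(password))> <base64(salt)>",
      "read": "<base64(sha256(password))> <base64(salt)>",
      "1": "<base64(sha256(password))> <base64(salt)>"
    }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      { "name": "all", "role": "adminRole" },
      { "name": "read", "role": "readRole" },
      { "name": "read", "collection": "biblio", "role": "biblioRole" }
    ],
    "user-role": {
      "admin": ["adminRole"],
      "read": ["readRole"],
      "1": ["biblioRole"]
    }
  }
}
```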
Hi,
This is a Solr-side issue, not a Zookeeper-side issue.
Zookeeper 3.4.13 is a 5-month-old version, so you can use it on the server
side with the Zookeeper client 3.4.11 provided by Solr.
Dominique
On Thu, Dec 20, 2018 at 01:53, Yasufumi Mizoguchi
wrote:
> Hi,
>
> I searched JIRA and found
Hi,
What is the scenario? High query activity? High update activity?
Regards.
Dominique
On Wed, Dec 19, 2018 at 13:44, AshB wrote:
> Hi,
>
> We are facing issue with solr/zookeeper where zookeeper timeouts after
> 1ms. Error below.
>
> *SolrException:
Hi,
There is the powerful JMeter, obviously, and also SolrMeter (
https://github.com/tflobbe/solrmeter).
Regards
Dominique
On Thu, Dec 20, 2018 at 03:17, zhenyuan wei wrote:
> Hi all,
>Is there a common tool for SOLR benckmark? YCSB is not very
> suitable for SOLR. Currently, Is
Hi,
Are you aware of the issues with Java applications in Docker when the Java
version is not 10?
https://blog.docker.com/2018/04/improved-docker-container-integration-with-java-10/
Regards.
Dominique
On Wed, Sep 12, 2018 at 05:42, Shawn Heisey wrote:
> On 9/11/2018 9:20 PM, solrnoobie wrote:
>
Hi,
I don't find any documentation about the parameter zookeeper_server_java_heaps
in zoo.cfg.
The way to control the Java heap size is either the java.env file or the
zookeeper-env.sh file. In zookeeper-env.sh:
SERVER_JVMFLAGS="-Xmx512m"
How much RAM is on your server?
Regards
Dominique
On Mon,
Hi,
We have a date field with default set to "NOW". For this field, some
documents of the collection don't have the same value in all replicas. The
difference can be 3 or 4 minutes!
The collection has 1 shard and 2 NRT replicas. Solr version is 7.5.
Collection is populated with DIH.
Any ideas
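One likely cause is that a field default of NOW is evaluated independently on each NRT replica when the document is indexed locally. A common workaround is to set the timestamp once, before the update is distributed, via an update processor; a sketch (the chain and field names are illustrative):

```xml
<!-- Runs on the node receiving the update, BEFORE distribution,
     so all replicas store the same timestamp value -->
<updateRequestProcessorChain name="add-timestamp" default="true">
  <processor class="solr.TimestampUpdateProcessorFactory">
    <str name="fieldName">indexed_at</str>
  </processor>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```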
SynonymMaps.
> >>>>
> >>>> Regards
> >>>> Bernd
> >>>>
> >>>>
> >>>> On 30.09.19 at 08:41, Andrea Gazzarini wrote:
> >>>>> Hi,
> >>>>> looking at the stateful nature of
Hi,
I can't find an explanation of what the 2 numeric values at the end
of these log lines mean.
Regards.
Dominique
2019-09-30 09:19:17.474 INFO (qtp2051853139-9577) [c:maCollection3s3r
s:shard1 r:core_node11 x:maCollection3s3r_shard1_replica_t2]
o.a.s.u.p.LogUpdateProcessorFactory
Hi,
My concern is about the memory used by the synonym filter, especially if
the synonym resource files are large.
Suppose that in my schema there are two field types, "TypeSyno1" and
"TypeSyno2", using the synonym filter with the same synonym files, and
that for each of these two field types there are two fields:
Field1 type
s replicating changed segments and that’s slowing down
> ingestion?
>
> It’d be interesting to index to NRT, leader-only and also a single TLOG
> collection.
>
>
> Best,
> Erick
>
> > On Oct 25, 2019, at 8:28 AM, Dominique Bejean
> wrote:
> >
> > Shawn
thout _either_ reading or writing to ZK.
>
> One rather obscure cause for ZK writes is when using “schemaless” mode.
> When a new field is detected, the schema (and thus the collection’s
> configuration) is changed, which generates writes..
>
> Best,
> Erick
>
>
> > On
Hi,
I would like to be certain that I understand how Solr uses Zookeeper, and
more precisely when Solr writes into Zookeeper.
Solr stores various information in ZK:
- global configuration (autoscaling, security.json)
- collection configuration (configs)
- collection state (state.json,
Hi Paresh,
Due to the impact of deleteDocByQuery on commits and searcher reopening, if
a lot of deletions are done it is preferable, when possible, to use deleteById.
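For reference, the two JSON update bodies sent to /update differ only in the delete clause; the collection and values here are illustrative:

```json
{ "delete": { "id": "doc42" } }
```

The delete-by-query counterpart, { "delete": { "query": "status:archived" } }, is the form that triggers the more expensive commit and searcher-reopen behavior mentioned above.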
Regards
Dominique
On Tue, Nov 12, 2019 at 07:03, Paresh wrote:
> Hi Erik,
>
> I am also looking for some example of deleteDocByQuery.
Thank you Shawn.
You're right!
It is better to read the right version of the Collections API documentation.
On Tue, Dec 10, 2019 at 19:49, Shawn Heisey wrote:
> On 12/10/2019 11:25 AM, Dominique Bejean wrote:
> > I would like to convert a collection (3 shards x 3 replicas)
Hi,
I would like to convert a collection (3 shards x 3 replicas) from TLOG to
NRT.
The only solution I imagine is something like:
* with the Collections API, remove replicas in order to keep only 1 replica
for each of the 3 shards
* update the collection state.json in Zookeeper
* with the Collections API, reload the
est?
>
> > On 25.10.2019 at 09:16, Dominique Bejean <
> dominique.bej...@eolya.fr> wrote:
> >
> > Hi,
> >
> > I made some benchmarks for bulk indexing in order to compare performances
> > and ressources usage for NRT versus TLOG replica.
> >
>
10/25/2019 1:16 AM, Dominique Bejean wrote:
> > For collection created with all replicas as NRT
> >
> > * Indexing time : 22 minutes
>
>
>
> > For collection created with all replicas as TLOG
> >
> > * Indexing time : 34 minutes
>
> NRT indexes sim
Hi,
Solr has not been tested with Tomcat since version 4.
Why not use the embedded Jetty server?
Regards
Dominique
On Tue, Oct 15, 2019 at 10:44, vikas shinde wrote:
> Dear Solr team,
>
> Which is the latest Tomcat version that supports the latest Solr version
> 8.2.0?
>
> Also provide
Hi,
I made some benchmarks for bulk indexing in order to compare performance
and resource usage for NRT versus TLOG replicas.
Environment:
* Solrcloud with 4 Solr nodes (8 Gb RAM, 4 Gb Heap)
* 1 collection with 2 shards x 2 replicas (all NRT or all TLOG)
* 1 core per Solr Server
Indexing of