Add Hosts in SolrCloud

2020-09-28 Thread Massimiliano Randazzo
Hello everybody

I have a SolrCloud cluster consisting of 4 servers, and a collection with 2
shards and a replication factor of 2

Collection: bookReaderAttilioHortis
Shard count: 2
configName: BookReader
replicationFactor: 2
maxShardsPerNode: 2
router: compositeId
autoAddReplicas: false

I would like to add 2 more servers, bringing the shard count to 3 while
keeping a replication factor of 2, to increase storage space and performance
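For what it's worth, with the compositeId router the shard count is fixed when the collection is created, so going from 2 to 3 shards usually means splitting an existing shard with the Collections API (or reindexing into a new 3-shard collection). A minimal sketch of the split call; the collection name comes from the thread, the host name is an assumption:

```shell
# Build the SPLITSHARD request for the collection from the thread.
# "solr-host" is a placeholder; splitting shard1 yields shard1_0 and
# shard1_1, whose replicas can then be placed on the two new servers.
SOLR="http://solr-host:8983"
COLL="bookReaderAttilioHortis"
URL="$SOLR/solr/admin/collections?action=SPLITSHARD&collection=$COLL&shard=shard1&async=split-1"
echo "$URL"      # inspect the request first
# curl "$URL"    # uncomment to execute against the live cluster
```

After the split completes and the new replicas are active, the parent shard1 goes inactive and can be removed with DELETESHARD.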

Thank you in advance for your help

Thank you
Massimiliano Randazzo

-- 
Massimiliano Randazzo

Analyst Programmer,
Senior Systems Administrator
Mobile +39 335 6488039
email: massimiliano.randa...@gmail.com
pec: massimiliano.randa...@pec.net


Re: Time out problems with the Solr server 8.4.1

2020-02-27 Thread Massimiliano Randazzo
Thank you,

I'll proceed with installing the system directly on the server where I have
the data folder, removing NFS, and I will let you know

On Thu, 27 Feb 2020 at 10:52, Dario Rigolin <dario.rigo...@comperio.it> wrote:

> I think the issue is NFS. If you move everything to an NVMe or SSD local to
> the server, the indexing process will work fine.
> NFS is the wrong filesystem for Solr.
>
> I hope this helps.
>

Re: Time out problems with the Solr server 8.4.1

2020-02-26 Thread Massimiliano Randazzo
On Wed, 26 Feb 2020 at 23:42, Vincenzo D'Amore <v.dam...@gmail.com> wrote:

> Hi Massimiliano,
>
> it’s not clear how much memory you have configured for your Solr instance.
>

SOLR_HEAP="20480m"
SOLR_JAVA_MEM="-Xms20480m -Xmx20480m"
GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"

> And I would avoid an NFS mount for the dataDir.
>
> Ciao,
> Vincenzo
>
> --
> mobile: 3498513251
> skype: free.dev
>




Re: Time out problems with the Solr server 8.4.1

2020-02-26 Thread Massimiliano Randazzo
On Wed, 26 Feb 2020 at 19:30, Dario Rigolin <dario.rigo...@comperio.it> wrote:

> You can avoid commits and let Solr autocommit at certain intervals.
> Or use soft commits if you have search queries to answer at the same time.
> 550,000 pages of 3,500 words each isn't a big deal for a Solr server;
> what's the hardware configuration?
>
The Solr instance runs on a server with the following configuration:
12-core Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
64GB RAM
Solr's dataDir is on a volume of another server that I mounted via NFS (I
was thinking of moving the Solr server to the server where the dataDir
resides, even though it has a lower specification: 8-core Intel(R) Xeon(R)
CPU E5506 @ 2.13GHz, 24GB RAM)

> What's your single Solr document: a single newspaper? A single page?
>

each single Solr document refers to a single word of the source document


> Do you have a SolrCloud with 8 nodes? Or are you sending the same document
> to 8 single Solr servers?
>

I have 8 servers that process the 550,000 newspaper pages, and all of them
write to 1 Solr server only


> --
>
> Dario Rigolin
> Comperio srl - CTO
> Mobile: +39 347 7232652 - Office: +39 0425 471482
> Skype: dario.rigolin
>




Time out problems with the Solr server 8.4.1

2020-02-26 Thread Massimiliano Randazzo
Good morning

I have the following situation: I have to index the OCR of about 550,000
newspaper pages, averaging 3,500 words per page, and since I make one
document per word, the records are many.

At the moment I have 1 instance of Solr and 8 servers that all read and
write on the same instance at the same time. At the beginning everything is
fine; after a while, when I add, delete, or commit, I get a TimeOut error
from the Solr server.

I suspect the problem is that I perform many commit operations with many
docs at a time (practically, if a newspaper issue is 30 pages I do 105,000
adds and commit at the end); if all 8 servers do this within a short time of
each other, I think it creates problems for Solr.
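One common way to relieve this commit pressure is to drop explicit commits from the indexing clients entirely and pass commitWithin on the update request instead, letting Solr batch the work. A sketch; the host, collection, and field names are assumptions, not from the thread:

```shell
# Send a batch of word-level documents; Solr commits them within 60s on
# its own, so none of the 8 writers ever issues an explicit commit.
# Host, collection, and field names are placeholders.
DOCS='[{"id":"page1-w1","word":"example","page":1}]'
URL="http://solr-host:8983/solr/newspapers/update?commitWithin=60000"
echo "POST $URL"
# curl -s -H 'Content-Type: application/json' -d "$DOCS" "$URL"
```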

What can I do to solve the problem?
Should I commit after each add?
Is it possible to configure the Solr server so that it applies the add and
delete commands and then commits autonomously, based on the available
resources, as it seems to do for the optimize command?
Reading the documentation I found the following configuration to implement,
but I don't know whether it solves my problem:


<deletionPolicy class="solr.SolrDeletionPolicy">
  <str name="maxCommitsToKeep">1</str>
  <str name="maxOptimizedCommitsToKeep">0</str>
  <str name="maxCommitAge">1DAY</str>
</deletionPolicy>
<infoStream>false</infoStream>
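For reference, the commit behaviour asked about above is normally controlled by the autoCommit / autoSoftCommit settings in solrconfig.xml rather than the deletion policy. A minimal sketch, where the intervals are illustrative assumptions rather than recommendations:

```xml
<!-- Hard-commit (flush to disk) at most every 60s without opening a new
     searcher; soft-commit every 5 minutes to make documents searchable.
     Intervals are illustrative; tune them for your indexing load. -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>300000</maxTime>
</autoSoftCommit>
```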



Thanks for your consideration
Massimiliano Randazzo


Re: Optimize solr 8.4.1

2020-02-26 Thread Massimiliano Randazzo
Hi Paras,

thank you for your answer if you don't mind I would have a couple of
questions

I am experiencing very long indexing times. I currently have 8 servers
working against 1 instance of Solr, and I was thinking of moving to a cloud
of 4 Solr servers with 3 ZooKeeper servers to distribute the load, but I was
wondering whether I would have to start the indexing over, or whether there
is a tool to load the index of a standalone Solr into a SolrCloud,
redistributing the data.

Currently in the "managed-schema" file the indexed fields are configured as
type="text_it", which uses "lang/stopwords_it.txt". I have been asked to
remove the stopwords. If I modify the "managed-schema" file and remove the
stopwords file, is it possible to re-index the collection using the
documents already present, without having to reload all the source material?

Thank you
Massimiliano Randazzo

On Wed, 26 Feb 2020 at 13:26, Paras Lehana <paras.leh...@indiamart.com> wrote:

> Hi Massimiliano,
>
> Is it still necessary to run the Optimize command from my application when
> > I have finished indexing?
>
>
> I guess you can stop worrying about optimizations and let Solr handle that
> implicitly. There's nothing so bad about having more segments.
>
> --
> --
> Regards,
>
> *Paras Lehana* [65871]
> Development Engineer, *Auto-Suggest*,
> IndiaMART InterMESH Ltd,
>
> 11th Floor, Tower 2, Assotech Business Cresterra,
> Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305
>
> Mob.: +91-9560911996
> Work: 0120-4056700 | Extn:
> *1196*
>




Optimize solr 8.4.1

2020-02-26 Thread Massimiliano Randazzo
> Good morning,
>
> recently I went from version 6.4 to version 8.4.1. I access Solr through
> Java applications written by me, in which I have updated the
> solr-solrj-8.4.1.jar libraries.
>
> I am indexing the OCR of a newspaper of about 550,000 pages in production,
> for which I have calculated at least 1,000,000,000 words, and I am
> experiencing slowness. I wanted to know if you could advise me on changes
> to the configuration.
>
> The server I'm using has 12 cores and 64GB of RAM; the only changes I made
> in the configuration are in the solr.in.sh file:
> SOLR_HEAP="20480m"
> SOLR_JAVA_MEM="-Xms20480m -Xmx20480m"
> GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
>   -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
> The Java version I use is
> java version "1.8.0_51"
> Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
> Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
>
> Also, comparing the Solr web interface, I noticed a difference on the
> "Overview" page: Solr 6.4 showed Optimized and Current and allowed me to
> launch Optimize if necessary, while in version 8.4.1 Optimized is no longer
> present. I assumed this activity now happens with the commit or through
> some background operation. If so, is it still necessary to run the
> Optimize command from my application when I have finished indexing? I
> noticed that the Optimize function requires considerable time and
> resources, especially on large indexes.
>
> Thank you for your attention

Massimiliano Randazzo



Optimize sole 8.4.1

2020-02-25 Thread Massimiliano Randazzo
Good morning,

recently I went from version 6.4 to version 8.4.1. I access Solr through
Java applications written by me, in which I have updated the
solr-solrj-8.4.1.jar libraries.

I am indexing the OCR of a newspaper of about 550,000 pages in production,
for which I have calculated at least 1,000,000,000 words, and I am
experiencing slowness. I wanted to know if you could advise me on changes
to the configuration.

The server I'm using has 12 cores and 64GB of RAM; the only changes I made
in the configuration are in the solr.in.sh file:
SOLR_HEAP="20480m"
SOLR_JAVA_MEM="-Xms20480m -Xmx20480m"
GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
The Java version I use is
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)

Also, comparing the Solr web interface, I noticed a difference on the
"Overview" page: Solr 6.4 showed Optimized and Current and allowed me to
launch Optimize if necessary, while in version 8.4.1 Optimized is no longer
present. I assumed this activity now happens with the commit or through some
background operation. If so, is it still necessary to run the Optimize
command from my application when I have finished indexing? I noticed that
the Optimize function requires considerable time and resources, especially
on large indexes.

Thank you for your attention
--
Sent from Gmail Mobile