Re: [OPEN-ILS-GENERAL] Evergreen Software Performance Analysis

2013-09-25 Thread Scott Myers
Hi Rogan,

The db work Command Prompt has done for KCLS is mostly configuration: work_mem, 
max_connections, etc. They have been fine-tuning all those settings to get the 
best performance. These settings wouldn't help other people, as they depend on 
each library's load. Another change made by Command Prompt was to remove Slony 
replication and move to pgpool. If anyone needs help doing the same with their 
database, I would highly recommend Command Prompt.

As for work done by Catalyst, all work that is directly applicable and 
beneficial to the community has been added. Kyle Tomita 
https://launchpad.net/~tomitakyle and Fred Parks https://launchpad.net/~fparks 
have been the most active community members from our team with Kyle being the 
9th on the top contributors list as of 9/24/13.

Catalyst also shared a multithreaded bib reingest that greatly reduces the time 
needed to do a full reingest. We also plan to share the way that Catalyst 
deploys code to KCLS without downtime.

Catalyst considers itself part of the community and is actively working to add 
more value. We have developed a strong relationship with KCLS and greatly enjoy 
working with them. That relationship has allowed us to gain a strong 
understanding of Evergreen. We've got some interesting work that we are going 
to be doing in the near future for KCLS, and as we have in the past, that which 
is beneficial to the community will be shared.

If you would like detail on any of these items now, feel free to reach out to 
me. You have my cell phone number.

Thanks

Scott Myers


From: open-ils-general-boun...@list.georgialibraries.org 
[mailto:open-ils-general-boun...@list.georgialibraries.org] On Behalf Of Rogan 
Hamby
Sent: Tuesday, September 24, 2013 7:10 AM
To: Joshua D. Drake
Cc: Evergreen Discussion Group
Subject: Re: [OPEN-ILS-GENERAL] Evergreen  Software Performance Analysis

Picking back up an old thread...

I was hoping at some point to hear more about the db work Command Prompt has 
done for KCLS and perhaps see some work in git. I was sad to see that in the 
new LJ article Jed Moffitt said that at this point KCLS has forked 
Evergreen, so I suppose the work Catalyst and Command Prompt have done isn't 
relevant to the rest of the Evergreen community.  I suppose that also means 
that any experience gained in working on the KCLS system isn't transferable.





On Thu, Aug 22, 2013 at 11:05 AM, Rogan Hamby 
rogan.ha...@yclibrary.net wrote:
Hi Joshua,

I don't know if you had a chance to see my message below, so I'll copy you in 
directly as well and maybe touch base again after Labor Day.  The Evergreen 
community has a rich collection of input from various contributors (many, like 
yourself, paid to do individual development by community members), all 
participating in the open source spirit and putting their code out there, 
allowing others to build on top of it, modify it, or package it into master.  
It would be exciting to see this work, since you've indicated it's had a big 
impact for your customers.

I did a quick MarkMail search, since I sometimes lose emails to spam filters, 
and noticed that back in Feb you mentioned that your Evergreen customer has 
been KCLS.  I know that at the conference they talked about setting up a public 
repo that would be available right after the conference.  Maybe they can chime 
in with an update on that?


On Fri, Aug 9, 2013 at 11:52 AM, Rogan Hamby 
rogan.ha...@yclibrary.net wrote:
Hi Josh,

Can you share with folks some more specifics?

For example:

In regards to optimizing the conf file can you share what kind of optimizations 
and the benchmarks?  E.g. with X records we see Y performance in activity Z.

A lot of the other changes obviously involve code and/or schema changes.  Are 
these going to be released on a public repo or fed back into master?




On Thu, Aug 8, 2013 at 2:01 PM, Joshua D. Drake 
j...@commandprompt.com wrote:

On 08/07/2013 10:12 AM, Rogan Hamby wrote:
I'm guessing maybe Joshua doesn't keep track of the list serv but is
there someone else from Command Prompt or whomever they did the
development work for that could chime in?  When he says they've made
improvements do those include GPLed code?

Sorry folks, I do watch this list but not as much as the PostgreSQL lists. We 
have also been very busy. Here are some of the basic things we have done:

1. Optimized postgresql.conf. It is amazing how much you can get from some 
minor tweaks after some performance analysis.

2. Converted some of the procedures to C, for example translate_isbn1013

3. Modified the holds process to use a look up table.

4. Changed the holds process so that holds don't exist indefinitely in the 
active table. They get migrated out for reporting, so they no longer affect 
performance of the active table.

5. Partitioning of larger tables

6. Upgraded versions of PostgreSQL to more modern versions

[OPEN-ILS-GENERAL] RE: Evergreen Software Performance Analysis

2013-09-25 Thread Scott Myers
Mike,

The multithreaded reingest project was shared during the hackathon at the last 
Evergreen conference.

Here is a link to what we ended up running for moving KCLS from 2.1 to 2.2.

https://github.com/CatalystIT/multithread_2_2_update

The files to pay attention to are data_update_driver.pl and update_driver.pl. 
Both have pod files attached with quite a few comments on how they work.

To clear up what that means: basically, we created driver files that divide 
large amounts of data into smaller chunks and run those chunks on multiple 
connections for CPU-bound updates. A good example is the 2.1-2.2 upgrade, which 
changed how the data was stored in the metabib field entry tables. This was a 
very CPU-bound update, and it ended up being run with 32 simultaneous 
connections, reducing the estimated time from 5 days to 4 hours.
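
For anyone who wants the general shape of the chunking idea without reading 
the Perl, here is a minimal Python sketch. It is not the code from the 
repository above. The function names (make_chunks, update_chunk, run_parallel), 
the ID range, and the chunk size are all assumptions made up for illustration, 
and update_chunk is a stub that only counts rows where a real driver would 
open its own database connection and run the update.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chunks(first_id, last_id, chunk_size):
    """Split the inclusive ID range [first_id, last_id] into (start, end) pairs."""
    chunks = []
    start = first_id
    while start <= last_id:
        end = min(start + chunk_size - 1, last_id)
        chunks.append((start, end))
        start = end + 1
    return chunks


def update_chunk(chunk):
    """Hypothetical stand-in for the real per-chunk work.

    A real driver would open its own database connection here and run the
    update for rows whose ID falls in [start, end]. This stub only returns
    the number of IDs in the chunk.
    """
    start, end = chunk
    return end - start + 1


def run_parallel(first_id, last_id, chunk_size, workers, work=update_chunk):
    """Run `work` once per chunk, with at most `workers` chunks in flight."""
    chunks = make_chunks(first_id, last_id, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the chunks.
        results = list(pool.map(work, chunks))
    return chunks, results


if __name__ == "__main__":
    # 32 workers mirrors the 32 simultaneous connections mentioned above;
    # the ID range and chunk size are arbitrary example values.
    chunks, results = run_parallel(1, 1_000_000, 10_000, workers=32)
    print(len(chunks), "chunks,", sum(results), "rows covered")
```

Client-side threads are enough in a sketch like this because the heavy work 
would happen in the database, one backend per connection. For scale, 5 days is 
120 hours, so the drop to 4 hours is roughly a 30x speedup on 32 connections.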

Let me know if you have questions on how this can be set up or run.

Thanks

Scott Myers

-Original Message-
From: open-ils-general-boun...@list.georgialibraries.org 
[mailto:open-ils-general-boun...@list.georgialibraries.org] On Behalf Of Mike 
Rylander
Sent: Wednesday, September 25, 2013 1:41 PM
To: Evergreen Discussion Group
Subject: Re: [OPEN-ILS-GENERAL] Evergreen  Software Performance Analysis

Scott,

I echo Rogan's down-thread thanks for following up here.

I'm curious where the multi-threaded reingest project is shared.  I can't find 
anything like that searching any of the Evergreen mailing lists or 
Launchpad for terms like ingest and multi.
Perhaps I'm just missing it.  Some interest was expressed in the community IRC 
channel, but also some confusion as to what exactly that means.

TIA,

--
Mike Rylander
 | Director of Research and Development
 | Equinox Software, Inc. / Your Library's Guide to Open Source
 | phone:  1-877-OPEN-ILS (673-6457)
 | email:  mi...@esilibrary.com
 | web:  http://www.esilibrary.com


 On Thu, Aug 22, 2013 at 11:05 AM, Rogan Hamby rogan.ha...@yclibrary.net
 wrote:

 Hi Joshua,



 I don't know if you had a chance to see my message below, so I'll copy you in
 directly as well and maybe touch base again after Labor Day.  With the
 Evergreen community having a rich collection of input from various
 contributors (many like yourself paid to do individual development by
 community members) all participating in the open source spirit and putting
 their code out there, allowing others to build on top of it or modify it or
 package