RE: Is mod_perl for web app still active today? [EXT]

2022-09-21 Thread James Smith
It is declining - but the reason is that very few people used even 1% of its 
functionality - they just used it to have a Perl interpreter embedded in the 
webserver. It is not, and never was, a web framework - it was more fundamental than 
that - it adds functions to
Perl, but you could do what you wanted with it.

Plack (and Dancer) and Mojolicious are what most seem to be using now. These 
are good, but lose a lot of the functionality that mod_perl has to offer (though 
most people didn't use those parts anyway).

-Original Message-
From: Edward J. Sabol  
Sent: 20 September 2022 19:40
To: Ken Peng 
Cc: mod_perl list 
Subject: Re: Is mod_perl for web app still active today? [EXT]

On Sep 19, 2022, at 11:07 PM, Ken Peng  wrote:
> May I know if mod_perl is still active on web development today?

I'm not sure what you're asking.

Do some people still use it for web development? Yes, most definitely. However, 
I think it's safe to say it's declined in popularity compared to other Perl web 
development frameworks such as Plack and Mojolicious and obviously other 
non-Perl web frameworks, but it's still used by some web developers.

Is mod_perl still maintained? Yes. The latest release (2.0.12) was released in 
late January 2022.

Later,
Ed



--
 The Wellcome Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.


RE: is mpm_event safe for modperl handler? [EXT]

2022-08-10 Thread James Smith
People use nginx - probably because they haven't come across mpm_event. The two 
are, I think, comparable speed-wise, but out of the box mpm_event is more 
flexible - and the nice thing is that you can develop a site on a single 
mpm_worker instance and easily migrate over to an mpm_event/mpm_worker setup.

It is nice to use the same configuration language on both sites...

mpm_event, like all Apache MPMs, is much better at handling "issues", where nginx 
just goes F* and returns nothing. When we have issues we can detect them on the 
Apache MPMs - but we often fail to see nginx errors, as it sort of just dies!


-Original Message-
From: pengyh  
Sent: 09 August 2022 13:04
To: James Smith ; modperl@perl.apache.org
Subject: Re: is mpm_event safe for modperl handler? [EXT]

I think running nginx as the front-end server and mod_perl as the backend server is 
the more popular choice.



> If you want the speed of mod_event for static content and the power of 
> mod_perl for dynamic content - the best way is to run a lightweight mod_event 
> apache in front of a mod_prefork to run the mod_perl




RE: is mpm_event safe for modperl handler? [EXT]

2022-08-09 Thread James Smith
If you want the speed of mpm_event for static content and the power of mod_perl 
for dynamic content, the best way is to run a lightweight mpm_event Apache in 
front of an mpm_prefork Apache running mod_perl.

tbh this is exactly the setup most people use for other heavy backends 
(mod_fastcgi etc), but you still benefit from the power of mod_perl for your Perl.

I do this anyway, as the mpm_event instance bounces (restarts) really quickly in 
comparison to the heavier mod_perl-loaded backend (especially if you pre-cache a 
lot of your Perl), so it can handle "maintenance" events much better... Also, all 
the stuff like SSL etc is handled on the mpm_event instance, keeping the backend 
instance clean.

It also allows you to run two Apaches on a single box - a "test" and a "live".
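A minimal sketch of that front/back split (hostnames, ports, paths and handler names here are invented for illustration; the two halves would normally live in two separate httpd instances):

```apache
# Front end: an mpm_event build. Serves static files, terminates SSL,
# and proxies dynamic paths to the mod_perl back end on a local port.
<VirtualHost *:443>
    ServerName   example.org
    SSLEngine    on
    DocumentRoot /var/www/static
    ProxyPass        /app http://127.0.0.1:8081/app
    ProxyPassReverse /app http://127.0.0.1:8081/app
</VirtualHost>

# Back end: a separate mpm_prefork httpd with mod_perl, bound to localhost.
Listen 127.0.0.1:8081
<Location /app>
    SetHandler          perl-script
    PerlResponseHandler MyApp::Handler
</Location>
```

Because the heavyweight Perl interpreter lives only in the back-end instance, the front end can be restarted (or swapped for a "maintenance" page) almost instantly.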

-Original Message-
From: pengyh  
Sent: 05 August 2022 02:21
To: modperl@perl.apache.org
Subject: Re: is mpm_event safe for modperl handler? [EXT]


thanks. that does sound like a sorry state.


> No and neither is mod_worker. The only mpm you can safely use is 
> prefork. This is, in my opinion, mod_perl's fatal flaw which will doom it.




RE: Sharing read/WRITE data between threads? [EXT]

2021-08-25 Thread James Smith
The other problem with sharing writable data in "memory" is that it would not 
necessarily be shared between multiple server instances. We run multiple mod_perl 
instances for reliability. I agree with the other posters: use something like 
Redis or memcached (possibly backed with a database).
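A tiny illustration of why in-memory sharing fails across instances: each prefork child is its own OS process, so a write to a plain Perl structure in one process is invisible to the others. This sketch uses fork() to stand in for two Apache children:

```perl
use strict;
use warnings;

# Each prefork child is a separate process, so an in-memory write
# made in one child is never seen by its siblings.
my %counter = ( hits => 0 );

my $pid = fork() // die "fork failed: $!";
if ( $pid == 0 ) {
    $counter{hits}++;    # modifies the child's copy only
    exit 0;
}
waitpid( $pid, 0 );

# The parent's copy is untouched - hence the advice to keep shared
# writable state in an external store (Redis, memcached, a database).
print "parent sees hits = $counter{hits}\n";    # prints 0
```

The same applies between threads only if the interpreter clones data per thread, but between processes there is no workaround short of an external store or explicit shared memory.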


-Original Message-
From: David Booth  
Sent: 25 August 2021 00:51
To: modperl 
Subject: Sharing read/WRITE data between threads? [EXT]

I am using Apache/2.4.41 (Ubuntu), with mod_perl.  Apache uses multiple 
threads, and I want to share read/WRITE data between threads.  (I.e., I want to 
be able to modify some shared data in one thread, such that other threads can 
see those changes.)  In "Practical mod_perl" Stas Bekman describes how to share 
read-only data between threads, but says nothing about how to share WRITABLE 
data between threads.

Any clues about how this can be done?  I've searched high and low and found 
nothing.  I will also want to know what mechanisms are available to coordinate 
access to that shared data, such as locks or semaphores.

I also posted this message to StackOverflow, but got no response so far:
https://stackoverflow.com/questions/68901260/how-to-share-read-write-data-in-mod-perl-apache2
 

Any help would be appreciated!

Thanks,
David Booth




RE: mod_perl alternatives [EXT]

2021-03-18 Thread James Smith
The problem is, I don't think there is a real alternative - mod_perl is quite a 
unique infrastructure, across all languages, I believe! I don't think any other 
language/framework gives you this level of flexibility.

Most frameworks just concentrate on the request phase and shoehorn everything 
in there - so you can't mix and match which technologies you use for different 
parts of the request cycle.

I looked at rewriting our framework in PSGI - Dancer - and although it is 
possible, we would have had to throw away 50% of the ultra-cool features or 
implement a fake request cycle within the request phase (to mimic what most 
Dancer developers do anyway), but then putting the configurable logic in would 
add a whole new issue - Apache has this nice config framework already set up that 
you can use.
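As an example of that config framework: mod_perl exposes the individual request phases directly in ordinary Apache configuration, so different modules can handle different phases per location (the handler and variable names below are made up for illustration):

```apache
<Location /secure>
    # Different request phases handled by different Perl modules,
    # all wired together from plain Apache config rather than code.
    PerlAccessHandler   MyApp::CheckNetwork
    PerlAuthenHandler   MyApp::Login
    PerlResponseHandler MyApp::Page
    PerlSetVar          Theme dark
</Location>
```

A PSGI framework would typically have to reimplement this dispatch (access control, authentication, response) inside the single request phase.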

The other bit that was missing was the non-Perl part - being able to have 
different parts of the process handled by different languages (even the 
response phase).

-Original Message-
From: Jim Albert  
Sent: 18 March 2021 04:01
To: modperl@perl.apache.org
Subject: mod_perl alternatives [EXT]

Given the recent discussion on the need for mod_perl PMC members and the 
disclosure that there is no active development on mod_perl, this seems like an 
appropriate time to start a thread discussing mod_perl alternatives, in line 
with the various means of using mod_perl - from the low-level interfacing with 
the Apache server to the quick and dirty stuff (ModPerl::PerlRun, which I 
believe keeps Perl and modules in memory).

I've seen mod_fcgid proposed in posts on other forums. Has anyone played with 
alternatives? I expect the low level Apache interaction might be difficult to 
duplicate at least to continue to do so in Perl. Perhaps the ModPerl::PerlRun 
approach of keeping Perl and modules in memory is a potential starting point 
for discussion for those using mod_perl at the most basic level.

Jim







RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-09 Thread James Smith
The advantage of the web proxy is not from securing your app - although there 
are things you can do on the reverse proxy to secure less-secure apps.

Its main advantage is that it doesn't run a large software stack - and so it 
makes it harder for people to compromise your front end and then compromise 
your internal networks; even then, they have to get from that DMZ into your 
main infrastructure.

We go a step further at work. We have a DMZ <- a web zone <- an internal zone - so 
even if you can compromise the DMZ and the web servers, you still don't have 
direct access to the other machines - counting servers + desktop machines, 
something like 30-50K cores.


-Original Message-
From: Clive Eisen  
Sent: 09 February 2021 19:23
To: Rafael Caceres 
Cc: James Smith ; Vincent Veyron ; 
modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]


> On 9 Feb 2021, at 19:16, Rafael Caceres  wrote:
> 
> Another thing that can be done is keep the app server + DB inside your LAN 
> and place a reverse proxy on your DMZ, that adds some level of protection.

Not really - the only protection is if all your APIs or web pages are secure - 
the reverse proxy does not help or hinder that.

— 
C





RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-09 Thread James Smith
Mithun,

I'm not sure at what scale you work - but these observations come from experience 
with sites under small to medium load - and we rarely see an appreciable gain from 
using cached or pooled connections, just the occasional heartache they cause.

If you are working on small applications with a minimal number of databases on 
the DB server, then you may see some performance improvement (though tbh not as 
much as you used to, as the servers have changed). Unfortunately I don't, in 
both my main and secondary roles, and I know many others who come across these 
limitations as well.

I'm not saying don't use persistent or cached connections - but leaving it to 
some hidden layer is not necessarily a good thing to do - it can have 
unforeseen side effects (and Apache::DBI and PHP's pconnect have both shown 
these up).

If you are working with, e.g., MySQL, the overhead of the (socket) connection 
is very small, but having more connections open to cope with persistent 
connections often means (memory-wise) specifying a much larger database server - 
or not being able to do all the nice tricks with in-memory indexes and queries 
[to increase query performance]. Being able to choose which connections you keep 
open and which you open/close on a per-request basis gives you the benefits of 
caching without the risks involved [other than the "lock table" issue].


From: Mithun Bhattacharya 
Sent: 09 February 2021 18:34
To: mod_perl list 
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

Connection caching does work for most use cases - we have to accept James works 
in scenarios most developers can't fathom :)

If you are just firing off simple SQL statements without any triggers or named 
temporary tables involved, you should be good. The only time we recall tripping 
on a cached connection is when two different code snippets tried to create the 
same temporary table. Another time, the code was expecting the disconnect to 
complete the connection cleanup.

On Tue, Feb 9, 2021 at 11:47 AM Vincent Veyron 
mailto:vv.li...@wanadoo.fr>> wrote:
On Sun, 7 Feb 2021 20:21:34 +
James Smith mailto:j...@sanger.ac.uk>> wrote:

Hi James,

> DBI sharing doesn't really gain you much - and can actually lead you into a 
> whole world of pain. It isn't actually worth turning it on at all.
>

Never had a problem with it myself in years of using it, but I wrap my queries 
in an eval { } and check $@, so that the scripts are not left hanging; also I 
have a postgresql db ;-).

I ran some tests with ab, I do see an improvement in response speed :

my $dbh = DBI->connect()
Concurrency Level:  5
Time taken for tests:   22.198 seconds
Complete requests:  1000
Failed requests:0
Total transferred:  8435000 bytes
HTML transferred:   8176000 bytes
Requests per second:45.05 [#/sec] (mean)
Time per request:   110.990 [ms] (mean)
Time per request:   22.198 [ms] (mean, across all concurrent requests)
Transfer rate:  371.08 [Kbytes/sec] received

my $dbh = DBI->connect_cached()
Concurrency Level:  5
Time taken for tests:   15.133 seconds
Complete requests:  1000
Failed requests:0
Total transferred:  8435000 bytes
HTML transferred:   8176000 bytes
Requests per second:66.08 [#/sec] (mean)
Time per request:   75.664 [ms] (mean)
Time per request:   15.133 [ms] (mean, across all concurrent requests)
Transfer rate:  544.33 [Kbytes/sec] received
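For reference, the only code difference between the two runs above is the constructor. `DBI->connect_cached` takes the same arguments as `connect` but re-uses a cached handle when called again with identical arguments - which is what saves the per-request connection cost. A small sketch (assuming DBD::SQLite is installed; any DSN works the same way):

```perl
use strict;
use warnings;
use DBI;
use Scalar::Util qw(refaddr);

# Identical connection arguments -> connect_cached re-uses the handle.
my @args = ( 'dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 } );

my $dbh1 = DBI->connect_cached(@args);
my $dbh2 = DBI->connect_cached(@args);

print refaddr($dbh1) == refaddr($dbh2)
    ? "same cached handle re-used\n"
    : "fresh connection made\n";
```

Note the caveats raised elsewhere in this thread still apply: the cached handle persists across requests within a process, so temporary tables, locks, and transaction state can leak between requests if not cleaned up.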


--

Best regards, Vincent Veyron

https://compta.libremen.com
Free software for double-entry general accounting






RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-09 Thread James Smith
It doesn't matter which DB - and whether you wrap it in eval - it is a problem 
(Postgres has a similar problem; the one with the fewest problems is MySQL). If 
you have a secure environment where your databases are in a firewalled zone, it 
will happen to all of them... It's a nasty bit of networking - it does mean our 
meant-to-be-secure enterprise-level apps running against Oracle are less secure 
and less stable than the other apps we have (go figure!)...

-Original Message-
From: Vincent Veyron  
Sent: 09 February 2021 17:47
To: modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

On Sun, 7 Feb 2021 20:21:34 +
James Smith  wrote:

Hi James,

> DBI sharing doesn't really gain you much - and can actually lead you into a 
> whole world of pain. It isn't actually worth turning it on at all.
> 

Never had a problem with it myself in years of using it, but I wrap my queries 
in an eval { } and check $@, so that the scripts are not left hanging; also I 
have a postgresql db ;-).

I ran some tests with ab, I do see an improvement in response speed :

my $dbh = DBI->connect()
Concurrency Level:  5
Time taken for tests:   22.198 seconds
Complete requests:  1000
Failed requests:0
Total transferred:  8435000 bytes
HTML transferred:   8176000 bytes
Requests per second:45.05 [#/sec] (mean)
Time per request:   110.990 [ms] (mean)
Time per request:   22.198 [ms] (mean, across all concurrent requests)
Transfer rate:  371.08 [Kbytes/sec] received

my $dbh = DBI->connect_cached()
Concurrency Level:  5
Time taken for tests:   15.133 seconds
Complete requests:  1000
Failed requests:0
Total transferred:  8435000 bytes
HTML transferred:   8176000 bytes
Requests per second:66.08 [#/sec] (mean)
Time per request:   75.664 [ms] (mean)
Time per request:   15.133 [ms] (mean, across all concurrent requests)
Transfer rate:  544.33 [Kbytes/sec] received


-- 

Best regards, Vincent Veyron

https://compta.libremen.com
Free software for double-entry general accounting







RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread James Smith
That is a good sign - I would run with --brutal at least once and see what it 
throws up.

We tend to ignore a couple of the warnings - one is postfix if/unless and the 
other is multiline strings (we embed a lot of simple HTML templates in code, and 
it means I can make the HTML readable when rendered rather than being one long 
string; plus SQL queries are more readable, and heredocs are messy if you want 
to do concatenation or lots of printf/sprintf calls).

We have it as part of our svn commit pre-hooks so that people can't push code 
to our repos and break things (one of the reasons we haven't moved to git, as 
hooks are a bit messier... and people may not have the right software on the 
machines they have their repos on).
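Those two exceptions correspond to named Perl::Critic policies, which can be disabled in a shared `.perlcriticrc` so the pre-commit hook and every developer agree on the house style. A sketch (severity 3 matches the `-3` run discussed in this thread):

```ini
# .perlcriticrc - sketch of a shared policy file
severity = 3

# postfix if/unless is allowed house style
[-ControlStructures::ProhibitPostfixControls]

# multiline strings are used for embedded HTML templates and SQL
[-ValuesAndExpressions::ProhibitImplicitNewlines]
```

perlcritic picks this file up automatically from the current directory or `$HOME`, so the same exceptions apply in editors, CI, and commit hooks.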



From: Steven Haigh 
Sent: 08 February 2021 09:54
To: modperl@perl.apache.org
Subject: RE: Moving ExecCGI to mod_perl - performance and custom'modules'[EXT]

On Mon, Feb 8, 2021 at 09:13, James Smith 
mailto:j...@sanger.ac.uk>> wrote:

Use perl-critic - this will find most of the nasties you have. The classic is:

Thanks for the tip! I have no idea how long I've been writing stuff in perl - 
and I never knew of this!

I ran it with the -3 option - which I figure is a good middle ground...

The good news, I just ran it over a lot of my code and it seems the only real 
things it picks up are not having a /x on the end of regex matches, using hard 
tabs, and multiline strings. I'd say that's a good sign.

It did pick up a couple of open statements that I didn't have a close for 
(*slaps wrist*), but I haven't seen much in the way of what looks to be major 
issues.

I was trying to find the PBP references - and was amazed that the Perl Best 
Practices *ebook* is $56.20 AUD hahahah

Amazon has a few copies listed second hand, with 3 weeks shipping... The joys 
of being on an island a long way from anything ;)

--
Steven Haigh 📧 net...@crc.id.au 💻 https://www.crc.id.au




RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread James Smith
Use perl-critic - this will find most of the nasties you have. The classic is:

my $var = {code} if {condition};

The my gets round perl strict, but $var doesn't get updated if {condition} 
isn't met, so it holds the value from the last time round.

Better is:

my $var = '';
$var = {code} if {condition};

or

my $var = {condition} ? {code} : '';
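A runnable sketch of the pitfall and the fixes. Note the stale-value behaviour of `my ... if` is documented as undefined, and since Perl 5.30 actually reaching the false branch raises a fatal "my() in false conditional" error, so only the corrected forms are exercised with a false condition here:

```perl
use strict;
use warnings;

# The pitfall: "my" with a statement modifier. Historically $var could
# silently keep its value from the previous call when $cond was false;
# on modern perls hitting the false branch is a fatal error instead.
sub leaky {
    my ($cond) = @_;
    my $var = "set" if $cond;    # perl-critic flags this construct
    return $var;
}

# Fix 1: declare, then assign conditionally.
sub safe_two_step {
    my ($cond) = @_;
    my $var = '';
    $var = "set" if $cond;
    return $var;
}

# Fix 2: ternary initialisation.
sub safe_ternary {
    my ($cond) = @_;
    return $cond ? "set" : '';
}

print leaky(1),         "\n";    # set (true branch is fine)
print safe_two_step(0), "\n";    # empty string
print safe_ternary(1),  "\n";    # set
```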

From: Steven Haigh 
Sent: 08 February 2021 09:09
To: modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom'modules'[EXT]

On Sun, Feb 7, 2021 at 15:17, Chris 
mailto:cpb_mod_p...@bennettconstruction.us>>
 wrote:

Just remember to always write clean code that resets variables after doing 
tasks.

I'm a bit curious about this - whilst I'm still testing all this on a staging 
environment, how can I tell if things can leak between runs?

Is coding to normal 'use strict; use warnings;' standards good enough?

Are there other ways to confirm correct operations?

--
Steven Haigh 📧 net...@crc.id.au 💻 https://www.crc.id.au




RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-08 Thread James Smith
I meant between requests in the same thread - it uses persistent connections, 
which are bad in a number of ways (e.g. fun with locking if you use it and your 
script dies halfway through; it keeps too many connections open and can block 
the database; it has issues with secure setups - a firewall between the web and 
DB servers, for instance).

From: Wesley Peng 
Sent: 08 February 2021 00:29
To: modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

what's DBI sharing? do you mean Apache::DBI?
Does perl have a Java-style DB connection pool?

Thanks.

On Mon, Feb 8, 2021, at 4:21 AM, James Smith wrote:
DBI sharing doesn't really gain you much - and can actually lead you into a 
whole world of pain. It isn't actually worth turning it on at all.

We use dedicated DB caching in the cases where we benefit from it as and when 
you need it (low level caching database)...

Although milage may vary - our problem was DB servers with 30 or 40 databases 
on them being connected from a number of approximately 50-100 child apaches, 
meant we ended up blowing up the connections to the database server really 
quickly...


-Original Message-
From: Vincent Veyron mailto:vv.li...@wanadoo.fr>>
Sent: 07 February 2021 19:06
To: Steven Haigh mailto:net...@crc.id.au>>
Cc: James Smith mailto:j...@sanger.ac.uk>>; 
modperl@perl.apache.org<mailto:modperl@perl.apache.org>
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

On Sun, 07 Feb 2021 23:58:17 +1100
Steven Haigh mailto:net...@crc.id.au>> wrote:
>
> I haven't gotten into the preload or DBI sharing yet - as that'll end
> up needing a bit of a rewrite of code to take advantage of. I'd be
> open to suggestions here from those who have done it in the past to
> save me going down some dead ends :D

I use mod_perl handlers, so not sure how it mixes with PerlRegistry, but you 
probably want to have a look at connect_cached


--

Best regards, Vincent Veyron

https://compta.libremen.com
Free software for double-entry general accounting









RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-07 Thread James Smith
ion times, I do see this improvement using 'ab -n 100 
> >> -c32':
> >> 
> >> Apache + ExecCGI: Requests per second:13.50 [#/sec] (mean)
> >> Apache + mod_perl: Requests per second:59.81 [#/sec] (mean)
> >> 
> >> This is obviously a good thing.
> >> 
> >> I haven't gotten into the preload or DBI sharing yet - as that'll 
> >> end up needing a bit of a rewrite of code to take advantage of. I'd 
> >> be open to suggestions here from those who have done it in the past 
> >> to save me going down some dead ends :D
> >> 
> >> --
> >> Steven Haigh 📧 net...@crc.id.au 💻 
> >> https://www.crc.id.au
> >> 
> >> On Sun, Feb 7, 2021 at 12:49, James Smith  wrote:
> >>> As Wesley said - try Registry, that was the standard way of using 
> >>> mod_perl to cache perl in the server - but your problem might be 
> >>> due to the note in PerlRun...
> >>> 
> >>> https://perl.apache.org/docs/2.0/api/ModPerl/PerlRun.html#Description
> >>> META: document that for now we don't chdir() into the script's dir, 
> >>> because it affects the whole process under threads. 
> >>> `ModPerl::PerlRunPrefork 
> >>> <https://perl.apache.org/docs/2.0/api/ModPerl/PerlRunPrefork.html>` 
> >>> should be used by those who run only under prefork MPM.
> >>> {tbh most people don’t use mod perl under threads anyway as there 
> >>> isn’t really a gain from using them}
> >>> 
> >>> It suggests you use ModPerl::PerlRunPrefork - as this does an additional 
> >>> step to cd to the script directory - which might be your issue...
> 
> >>>  
> 
> >>> *From:* Steven Haigh 
> >>> *Sent:* 07 February 2021 01:00
> >>> *To:* modperl@perl.apache.org
> >>> *Subject:* Moving ExecCGI to mod_perl - performance and custom 
> >>> 'modules' [EXT]
> 
> >>>  
> 
> >>> Hi all,
> 
> >>>  
> 
> >>> So for many years I've been slack and writing perl scripts to do various 
> >>> things - but never needed more than the normal apache +ExecCGI and 
> >>> Template Toolkit.
> 
> >>>  
> 
> >>> One of my sites has become a bit more popular, so I'd like to spend a bit 
> >>> of time on performance. Currently, I'm seeing ~300-400ms of what I 
> >>> believe to be execution time of the script loading, running, and then 
> >>> blatting its output to STDOUT and the browser can go do its thing. 
> 
> >>>  
> 
> >>> I believe most of the delay would be to do with loading perl, its 
> >>> modules etc etc
> 
> >>>  
> 
> >>> I know that the current trend would be to re-write the entire site 
> >>> in a more modern, daemon based solution - and I started down the 
> >>> Mojolicious path - but the amount of re-writing to save 1/3rd of a 
> >>> second seems to be excessive
> 
> >>>  
> 
> >>> Would I be correct in thinking that mod_perl would help in this case?
> 
> >>>  
> 
> >>> I did try a basic test, but I have a 'use functions' in all my scripts 
> >>> that loads a .pm with some global vars and a lot of common subs - and for 
> >>> whatever reason (can't find anything on Google as to why), none of the 
> >>> subs are recognised in the main script when loaded via ModPerl::PerlRun.
> 
> >>>  
> 
> >>> So throwing it out to the list - am I on the right track? wasting my 
> >>> time? or just a simple mistake?
> 
> >>>  
> 
> >>> --
> 
> >>> Steven Haigh 📧 net...@crc.id.au 💻 
> >>> https://www.crc.id.au
> 




RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-07 Thread James Smith
No, it was one of our DBAs who told us to take it off...! It causes them all 
sorts of problems - the main one being that it keeps connections open, which is 
the best way to completely foobar the database servers! We make them happy by 
turning it off. We've even had Oracle developers (sorry, not people developing 
SQL in Oracle but people developing Oracle itself) scratching their heads over 
things like this and not being able to come up with a solution...

We've discovered lots of interesting side effects with connection pooling when 
used over a secure network - e.g. the Oracle client doesn't like running 
through a firewall, as it can't cope when the firewall drops the connection - 
on both sides of the system the "pool" thinks the connection is open, the 
database thinks the connection is open, and the client just hangs... We have 
seen similar things with other applications which use connection pooling and 
long-duration database handles. We get round it with permanent MySQL 
connections by closing/re-opening them after 5 minutes of inactivity - hence 
the need to develop our own cache/pool...
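A rough sketch of that idle-timeout idea (the API is invented for illustration; a real pool would hold DBI handles and also ping them, where this demo uses a counting stand-in for the connect call):

```perl
use strict;
use warnings;

# Hand-rolled connection cache: re-use a handle while it is warm,
# reconnect once it has sat idle longer than $MAX_IDLE seconds.
my $MAX_IDLE = 300;
my %cache;    # dsn => { handle => ..., last_used => epoch seconds }

sub get_handle {
    my ( $dsn, $connect ) = @_;   # $connect: coderef that opens a connection
    my $slot = $cache{$dsn};
    if ( !$slot || time() - $slot->{last_used} > $MAX_IDLE ) {
        $slot = $cache{$dsn} = { handle => $connect->() };
    }
    $slot->{last_used} = time();
    return $slot->{handle};
}

# Demo with a counting stand-in for DBI->connect:
my $connects = 0;
my $opener   = sub { ++$connects; "handle-$connects" };

get_handle( 'dbi:mysql:demo', $opener );    # opens connection 1
get_handle( 'dbi:mysql:demo', $opener );    # warm: re-used, no new connect

# Pretend the handle has been idle for ten minutes:
$cache{'dbi:mysql:demo'}{last_used} -= 600;
get_handle( 'dbi:mysql:demo', $opener );    # stale: reconnects

print "connections opened: $connects\n";    # prints 2
```

The point of rolling your own is visible in the stale branch: the application, not a hidden pooling layer, decides when a connection is too old to trust, which avoids the hung-client problem when a firewall silently drops the TCP session.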

From: Mithun Bhattacharya 
Sent: 07 February 2021 20:36
To: James Smith 
Cc: Vincent Veyron ; Steven Haigh ; 
modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

DBI sharing has its own issues, but in most use cases it does a pretty good job 
and keeps the DBAs happy - that is very important ;)

On Sun, Feb 7, 2021 at 2:23 PM James Smith 
mailto:j...@sanger.ac.uk>> wrote:
DBI sharing doesn't really gain you much - and can actually lead you into a 
whole world of pain. It isn't actually worth turning it on at all.

We use dedicated DB caching in the cases where we benefit from it as and when 
you need it (low level caching database)...

Although milage may vary - our problem was DB servers with 30 or 40 databases 
on them being connected from a number of approximately 50-100 child apaches, 
meant we ended up blowing up the connections to the database server really 
quickly...


-Original Message-
From: Vincent Veyron mailto:vv.li...@wanadoo.fr>>
Sent: 07 February 2021 19:06
To: Steven Haigh mailto:net...@crc.id.au>>
Cc: James Smith mailto:j...@sanger.ac.uk>>; 
modperl@perl.apache.org<mailto:modperl@perl.apache.org>
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

On Sun, 07 Feb 2021 23:58:17 +1100
Steven Haigh mailto:net...@crc.id.au>> wrote:
>
> I haven't gotten into the preload or DBI sharing yet - as that'll end
> up needing a bit of a rewrite of code to take advantage of. I'd be
> open to suggestions here from those who have done it in the past to
> save me going down some dead ends :D

I use mod_perl handlers, so not sure how it mixes with PerlRegistry, but you 
probably want to have a look at connect_cached


--

Best regards, Vincent Veyron

https://compta.libremen.com
Free software for double-entry general accounting



--
 The Wellcome Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.




RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-07 Thread James Smith
DBI sharing doesn't really gain you much - and can actually lead you into a 
whole world of pain. It isn't really worth turning it on at all.

We use dedicated DB caching in the cases where we benefit from it, as and when 
we need it (a low-level caching database)...

Although mileage may vary - our problem was DB servers with 30 or 40 databases 
on them being connected to from approximately 50-100 Apache child processes, 
which meant we ended up blowing up the connection limit on the database server 
really quickly...


-Original Message-
From: Vincent Veyron  
Sent: 07 February 2021 19:06
To: Steven Haigh 
Cc: James Smith ; modperl@perl.apache.org
Subject: Re: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

On Sun, 07 Feb 2021 23:58:17 +1100
Steven Haigh  wrote:
> 
> I haven't gotten into the preload or DBI sharing yet - as that'll end 
> up needing a bit of a rewrite of code to take advantage of. I'd be 
> open to suggestions here from those who have done it in the past to 
> save me going down some dead ends :D

I use mod_perl handlers, so not sure how it mixes with PerlRegistry, but you 
probably want to have a look at connect_cached
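The connect_cached approach mentioned here looks roughly like this (a minimal sketch; the DSN, credentials, and the `dbh` helper name are illustrative, not from the original thread):

```perl
use strict;
use warnings;
use DBI;

# connect_cached() returns the same handle for identical arguments
# within a process, pinging (and transparently reconnecting) stale
# handles first - so each Apache child keeps one connection per DSN
# instead of reconnecting on every request.
sub dbh {
    return DBI->connect_cached(
        'dbi:Pg:dbname=app;host=db.example.com',   # placeholder DSN
        'app_user',                                # placeholder user
        $ENV{APP_DB_PASS},                         # placeholder password
        { RaiseError => 1, AutoCommit => 1 },
    );
}
```

Under prefork this caches per child process, so the DBA-visible connection count is bounded by the number of children, not the number of requests.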


-- 

Bien à vous, Vincent Veyron 

https://compta.libremen.com
Logiciel libre de comptabilité générale en partie double





RE: [EXT]

2021-02-07 Thread James Smith
This can be a bigger gain if you are limited on memory, as multiple processes 
will share the same block of physical memory {so limiting swap} – this is one 
of the advantages of the mod_perl approach over the app approach of things like 
Dancer. You have the flexibility to include what you want: if you have a 
single webpage or service which needs a large number of modules but is 
perhaps used once in every 100k requests, you don’t have to load those modules 
at the start of the process, and you can choose how much or how 
little of the framework you want preloaded…


From: Adam Prime 
Sent: 07 February 2021 13:45
To: Steven Haigh 
Cc: James Smith ; modperl@perl.apache.org
Subject: Re: [EXT]

There is one other thing you can do relatively easily that may get you a 
marginal gain when Apache spins up new children. Load some or all of your Perl 
dependencies before Apache forks. There is also an opportunity to load other 
static resources at this time, but that can get a little more involved and/or 
confusing. There is some documentation here:

https://perl.apache.org/docs/2.0/user/handlers/server.html#Startup_File

Adam



On Feb 7, 2021, at 8:15 AM, Steven Haigh <net...@crc.id.au> wrote:

In fact, I just realised that 'ab' test is rather restrictive... So here's a 
bit more of an extended test:

# ab -k -n 1000 -c 32

Apache + ExecCGI:
Requests per second:    14.26 [#/sec] (mean)
Time per request:   2244.181 [ms] (mean)
Time per request:   70.131 [ms] (mean, across all concurrent requests)

Apache + mod_perl (ModPerl::PerlRegistry):
Requests per second: 132.14 [#/sec] (mean)
Time per request:   242.175 [ms] (mean)
Time per request:   7.568 [ms] (mean, across all concurrent requests)

Interestingly, without Keepalives, the story is much the same:

# ab -n 1000 -c 32

Apache + ExecCGI:
Requests per second:    14.15 [#/sec] (mean)
Time per request:   2260.875 [ms] (mean)
Time per request:   70.652 [ms] (mean, across all concurrent requests)

Apache + mod_perl (ModPerl::PerlRegistry):
Requests per second:    154.48 [#/sec] (mean)
Time per request:   207.140 [ms] (mean)
Time per request:   6.473 [ms] (mean, across all concurrent requests)

Running some benchmarks across various parts of my site made me realise I also 
had some RewriteRules in the apache config that still had H=cgi-script - 
changed those to H=perl-script and saw similar improvements:

ExecCGI - Requests per second:    11.84 [#/sec] (mean)
mod_perl - Requests per second:   130.97 [#/sec] (mean)

That's quite some gains for a days work.

--
Steven Haigh 📧 net...@crc.id.au 💻 https://www.crc.id.au

On Sun, Feb 7, 2021 at 23:58, Steven Haigh <net...@crc.id.au> wrote:

Interestingly, I did get things working with ModPerl::PerlRegistry.

What I couldn't find *anywhere* is that the data I was loading in Template 
Toolkit was included in the file in the __DATA__ area - which causes mod_perl 
to fall over!

The only way I managed to find this was the following error in the *system* 
/var/log/httpd/error_log (didn't show up in the vhost error_log!):
readline() on unopened filehandle DATA at 
/usr/lib64/perl5/vendor_perl/Template/Provider.pm line 638.

Took me a LONG time to find a vague post noting that reading in lines from <DATA> 
kills mod_perl. Not sure why - but I stripped all the templates out, put 
them in a file instead, re-wrote that bit of code, and things started 
working.

I had to fix a few lib path issues, but after getting my head around that, most 
things seem to work as before. While I don't notice much of an improvement in 
individual execution times, I do see this improvement using 'ab -n 100 -c32':

Apache + ExecCGI: Requests per second:    13.50 [#/sec] (mean)
Apache + mod_perl: Requests per second:   59.81 [#/sec] (mean)

This is obviously a good thing.

I haven't gotten into the preload or DBI sharing yet - as that'll end up 
needing a bit of a rewrite of code to take advantage of. I'd be open to 
suggestions here from those who have done it in the past to save me going down 
some dead ends :D


--
Steven Haigh 📧 net...@crc.id.au 💻 https://www.crc.id.au

RE: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

2021-02-07 Thread James Smith
As Wesley said – try Registry; that was the standard way of using mod_perl to 
cache perl in the server – but your problem might be due to the note in 
PerlRun…

https://perl.apache.org/docs/2.0/api/ModPerl/PerlRun.html#Description
META: document that for now we don't chdir() into the script's dir, because it 
affects the whole process under threads. ModPerl::PerlRunPrefork 
should be used by those who run only under prefork MPM.
{tbh most people don’t use mod_perl under threads anyway as there isn’t really 
a gain from using them}

It suggests you use ModPerl::PerlRunPrefork – as this does an additional step to 
cd to the script directory – which might be your issue…
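A minimal configuration sketch for running legacy CGI scripts under ModPerl::PerlRunPrefork (the directory path is illustrative):

```apache
# httpd.conf fragment - run legacy CGIs under ModPerl::PerlRunPrefork,
# which chdir()s into the script's directory before each run,
# unlike plain ModPerl::PerlRun.
PerlModule ModPerl::PerlRunPrefork
<Directory "/srv/app/cgi-bin">
    SetHandler perl-script
    PerlResponseHandler ModPerl::PerlRunPrefork
    PerlOptions +ParseHeaders
    Options +ExecCGI
</Directory>
```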

From: Steven Haigh 
Sent: 07 February 2021 01:00
To: modperl@perl.apache.org
Subject: Moving ExecCGI to mod_perl - performance and custom 'modules' [EXT]

Hi all,

So for many years I've been slack and writing perl scripts to do various things 
- but never needed more than the normal apache +ExecCGI and Template Toolkit.

One of my sites has become a bit more popular, so I'd like to spend a bit of 
time on performance. Currently, I'm seeing ~300-400ms of what I believe to be 
execution time of the script loading, running, and then blatting its output to 
STDOUT and the browser can go do its thing.

I believe most of the delay would be to do with loading perl, its modules etc 
etc

I know that the current trend would be to re-write the entire site in a more 
modern, daemon based solution - and I started down the Mojolicious path - but 
the amount of re-writing to save 1/3rd of a second seems to be excessive

Would I be correct in thinking that mod_perl would help in this case?

I did try a basic test, but I have a 'use functions' in all my scripts that 
loads a .pm with some global vars and a lot of common subs - and for whatever 
reason (can't find anything on Google as to why), none of the subs are 
recognised in the main script when loaded via ModPerl::PerlRun.

So throwing it out to the list - am I on the right track? wasting my time? or 
just a simple mistake?

--
Steven Haigh 📧 net...@crc.id.au 💻 https://www.crc.id.au




RE: Confused about two development utils [EXT]

2020-12-23 Thread James Smith
We don’t use perl for everything: yes, we use it for web data, and yes, we still 
use it as the glue language in a lot of cases; the most complex stuff is done with 
C (not even C++, as that is too slow). Others on site use Python, Java, Rust, 
Go, PHP, along with looking at using GPUs in cases where code can be highly 
parallelised.

It is not just one application – but many, many applications… All with a common 
goal of understanding the human genome, and using it to assist in developing 
new understanding and techniques which can advance health care.

We are a very large sequencing centre (one of the largest in the world) – what 
I was pointing out is that you can’t just throw memory, CPUs, power at a 
problem – you have to think – how can I do what I need to do with the least 
resources. Rather than what resources can I throw at the problem.

Currently we are acting as the central repository for all COVID-19 sequencing 
in the UK, along with one of the largest “wet” labs sequencing data for it – 
and that is half the sequenced samples in the whole world. The UK is sequencing 
more COVID-19 genomes a day than most other countries have sequenced since the 
start of the pandemic in Feb/Mar. This has led to us discovering a new, more 
transmissible version of the virus, and in what part of the country the 
different strains are present – no other country in the world has the 
information, technology or infrastructure in place to achieve this.

But this is just a small part of the genomic sequencing we are looking at – we 
work on:
* other pathogens – e.g. Plasmodium (Malaria);
* cancer genomes (and how effective drugs are);
* are a major part of the Human Cell Atlas which is looking at how the 
expression of genes (in the simplest terms which ones are switched on and 
switched off) are different in different tissues;
* sequencing the genomes of other animals to understand their evolution;
* and looking at some other species in detail, to see what we can learn from 
them when they have defective genes;

Although all these are currently scaled back so that we can work relentlessly 
to support the medical teams and other researchers get on top of COVID-19.

What is interesting is that many of the developers we have on campus (well, all 
wfh at the moment) are (relatively) old, as we learnt to develop code on 
machines with limited CPU and limited memory – so things had to be 
efficient, had to be compact…. And that is as important now as it was 20 or 30 
years ago – the data we handle is going up faster than Moore’s Law! Many of us 
take pride in doing things as efficiently as possible.

It took around 10 years to sequence and assemble the first human genome {well 
we are still tinkering with it and filling in the gaps} – now at the institute 
we can sequence and assemble around 400 human genomes in a day – to the same 
quality!

So most of our issues are due to the scale of the problems we face – e.g. the 
human genome has 3 billion base-pairs (A, C, G, Ts) , so normal solutions don’t 
scale to that (once many years ago we looked at setting up an Oracle database 
where there was at least 1 row for every base pair – recording all variants 
(think of them as spelling mistakes, for example a T rather than an A, or an 
extra letter inserted or deleted) for that base pair… The schema was set up – 
and then they realised it would take 12 months to load the data which we had 
then (which is probably less than a millionth of what we have now)!

Moving compute off site is a problem as the transfer of the level of data we 
have would cause a problem – you can’t easily move all the data to the compute 
– so you have to bring the compute to the data.

The site I worked on before I became a more general developer was doing that – 
and the code that was written 12-15 years ago is actually still going strong – 
it has seen a few changes over the years – many displays have had to be 
redeveloped as the scale of the data has got so big that even the summary pages 
we produced 10 years ago have to be summarised, because they are so large.


From: Mithun Bhattacharya 
Sent: 24 December 2020 00:06
To: mod_perl list 
Subject: Re: Confused about two development utils [EXT]

James, would you be able to share more info about your setup?
1. What exactly is your application doing which requires so much memory and CPU 
- is it something like gene splicing (no, I don't know much about it beyond 
Jurassic Park :D)?
2. Do you feel Perl was the best choice for whatever you are doing and if yes 
then why ? How much of your stuff is using mod_perl considering you mentioned 
not much is web related ?
3. What are the challenges you are currently facing with your implementation ?

On Wed, Dec 23, 2020 at 6:58 AM James Smith <j...@sanger.ac.uk> wrote:
Oh but memory is a problem – but not if you have just a small cluster of 
machines!

Our boxes are larger than that – but they all run virtual machine {only a small 
proportion 

RE: Confused about two development utils [EXT]

2020-12-23 Thread James Smith
Oh but memory is a problem – but not if you have just a small cluster of 
machines!

Our boxes are larger than that – but they all run virtual machines {only a small 
proportion web related} – machines/memory would rapidly become a problem in our 
data centre - we run VMWARE [995 hosts] and openstack [10,000s of hosts] + a 
selection of large-memory machines {measured in TBs of memory per machine}.

We would be looking at somewhere between 0.5 PB and 1 PB of memory – and it is 
not just the price of buying that amount of memory: for many machines we need the 
fastest memory money can buy for the workload, and we would need a lot more 
CPUs than we currently have, as we would need a larger number of machines to 
host 64GB virtual machines {we would get 2 VMs per host}. We currently have 
approx. 1-2,000 CPUs running our hardware (last time I had a figure) – it would 
probably need to go to approximately 5-10,000!
It is not just the initial outlay but the environmental and financial cost of 
running that number of machines, and finding space to run them without putting 
the cooling costs through the roof!! That is without considering what 
additional constraints having the extra machines may put on storage (at the 
last count a year ago we had over 30 PBytes of storage on site – and a large 
amount of offsite backup).

We would also stretch the amount of power we can get from the national grid to 
power it all - we currently have 3 feeds from different parts of the national 
grid (we are fortunately in a position where this is possible) and the dedicated 
link we would need to add more power would be at least 50 miles long!

So - managing cores/memory is vitally important to us – moving to the cloud is 
an option we are looking at – but that is more than 4 times the price of our 
onsite set-up (with substantial discounts from AWS) and would require an 
upgrade of our existing link to the internet – which is currently 40Gbit of 
data (I think).

Currently we are analysing very large amounts of data directly linked to the 
current major world problem – this is why the UK is currently being isolated: we 
have discovered and can track a new strain, in near real time – other 
countries have no ability to do this – in a day we can and do handle, sequence 
and analyse more samples than the whole of France has sequenced since February. 
We probably don’t have more of the new variant strain than other areas of 
the world – it is just that we know we have it because of the amount of sequencing 
and analysis that we in the UK have done.

From: Matthias Peng 
Sent: 23 December 2020 12:02
To: mod_perl list 
Subject: Re: Confused about two development utils [EXT]

Today memory is not a serious problem; each of our servers has 64GB of memory.


Forgot to add - so our FCGI servers need a lot (and I mean a lot) more memory 
than the mod_perl servers to serve the same level of content (just in case 
memory blows up with FCGI backends)

-Original Message-
From: James Smith <j...@sanger.ac.uk>
Sent: 23 December 2020 11:34
To: André Warnier (tomcat/perl) <a...@ice-sa.com>; modperl@perl.apache.org
Subject: RE: Confused about two development utils [EXT]


> This costs memory, and all the more since many perl modules are not 
> thread-safe, so if you use them in your code, at this moment the only safe 
> way to do it is to use the Apache httpd prefork model. This means that each 
> Apache httpd child process has its own copy of the perl interpreter, which 
> means that the memory used by this embedded perl interpreter has to be 
> counted n times (as many times as there are Apache httpd child processes 
> running at any one time).

This isn’t quite true - if you load modules before the process forks then they 
can cleverly share the same parts of memory. It is useful to be able to 
"pre-load" core functionality which is used across all functions {this is the 
case in Linux anyway}. It also speeds up child process generation as the 
modules are already in memory and converted to byte code.

One of the great advantages of mod_perl is Apache2::SizeLimit which can blow 
away large child process - and then if needed create new ones. This is not the 
case with some of the FCGI solutions as the individual processes can grow if 
there is a memory leak or a request that retrieves a large amount of content 
(even if not served), but perl can't give the memory back. So FCGI processes 
only get bigger and bigger and eventually blow up memory (or hit swap first)






RE: Confused about two development utils [EXT]

2020-12-23 Thread James Smith
Forgot to add - so our FCGI servers need a lot (and I mean a lot) more memory 
than the mod_perl servers to serve the same level of content (just in case 
memory blows up with FCGI backends)

-Original Message-
From: James Smith  
Sent: 23 December 2020 11:34
To: André Warnier (tomcat/perl) ; modperl@perl.apache.org
Subject: RE: Confused about two development utils [EXT]


> This costs memory, and all the more since many perl modules are not 
> thread-safe, so if you use them in your code, at this moment the only safe 
> way to do it is to use the Apache httpd prefork model. This means that each 
> Apache httpd child process has its own copy of the perl interpreter, which 
> means that the memory used by this embedded perl interpreter has to be 
> counted n times (as many times as there are Apache httpd child processes 
> running at any one time).

This isn’t quite true - if you load modules before the process forks then they 
can cleverly share the same parts of memory. It is useful to be able to 
"pre-load" core functionality which is used across all functions {this is the 
case in Linux anyway}. It also speeds up child process generation as the 
modules are already in memory and converted to byte code.

One of the great advantages of mod_perl is Apache2::SizeLimit which can blow 
away large child process - and then if needed create new ones. This is not the 
case with some of the FCGI solutions as the individual processes can grow if 
there is a memory leak or a request that retrieves a large amount of content 
(even if not served), but perl can't give the memory back. So FCGI processes 
only get bigger and bigger and eventually blow up memory (or hit swap first) 









RE: Confused about two development utils [EXT]

2020-12-23 Thread James Smith

> This costs memory, and all the more since many perl modules are not 
> thread-safe, so if you use them in your code, at this moment the only safe 
> way to do it is to use the Apache httpd prefork model. This means that each 
> Apache httpd child process has its own copy of the perl interpreter, which 
> means that the memory used by this embedded perl interpreter has to be 
> counted n times (as many times as there are Apache httpd child processes 
> running at any one time).

This isn’t quite true - if you load modules before the process forks then they 
can cleverly share the same parts of memory. It is useful to be able to 
"pre-load" core functionality which is used across all functions {this is the 
case in Linux anyway}. It also speeds up child process generation as the 
modules are already in memory and converted to byte code.

One of the great advantages of mod_perl is Apache2::SizeLimit which can blow 
away large child process - and then if needed create new ones. This is not the 
case with some of the FCGI solutions as the individual processes can grow if 
there is a memory leak or a request that retrieves a large amount of content 
(even if not served), but perl can't give the memory back. So FCGI processes 
only get bigger and bigger and eventually blow up memory (or hit swap first) 
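The Apache2::SizeLimit arrangement described above can be sketched in httpd.conf like this (the size limit is an illustrative value, given in KB):

```apache
<Perl>
    use Apache2::SizeLimit;
    # Let a child exit (after finishing its current request) once it
    # exceeds ~150 MB total process size; value is in KB.
    Apache2::SizeLimit->set_max_process_size(150_000);
</Perl>
# The cleanup handler checks the child's size after each request;
# Apache then forks a fresh, small child to replace any that exit.
PerlCleanupHandler Apache2::SizeLimit
```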






RE: suggestions for perl as web development language [EXT]

2020-12-22 Thread James Smith
There are not many applications which really benefit from multiple threads in 
web-server environments unless you have very low load – they are only 
efficient when they can use multiple cores, so you need stupidly specced machines 
to manage the load properly, and if you are using external resources they can 
often be blocking anyway.

Yes, mod_perl doesn’t do web-sockets – but usually that isn’t an issue unless 
you need to push when people update content and you want it 
immediately available. There are other issues to consider with web-sockets, 
e.g. port usage to keep the connections open. If you don’t need that immediate 
response, polling can achieve the desired effect – and Apache can even tell 
the client how long to wait before sending another request.


From: John Dunlap 
Sent: 22 December 2020 13:35
To: Vincent Veyron 
Cc: mod_perl list 
Subject: Re: suggestions for perl as web development language [EXT]

mod_perl is horribly inefficient because prefork is inefficient and because 
each request is single threaded. In addition to this, mod_perl also cannot 
provide websockets which are essential in a modern application.

On Mon, Dec 21, 2020 at 1:26 AM Vincent Veyron <vv.li...@wanadoo.fr> wrote:

[You forgot to cc the list ]

On Sun, 20 Dec 2020 23:16:03 -0500
John Dunlap <j...@lariat.co> wrote:

> We run 20 customers on a single box and our database has approximately 500
> tables. We run hundreds or thousands of queries per second.
>

500 tables is a lot more than what I typically handle. I'm sure it complicates 
things.

But see this post by James Smith in a recent thread :

http://mail-archives.apache.org/mod_mbox/perl-modperl/202008.mbox/ajax/%3Cef383804cf394c53b48258531891d12b%40sanger.ac.uk%3E

Easier to read in this archive :

http://mail-archives.apache.org/mod_mbox/perl-modperl/202008.mbox/browser

I also remember a post by a chinese guy who handled the same order of database 
size, in which he wrote that he had compared several frameworks and mod_perl 
was the fastest; but that was something like 10 years ago, and I can't find it 
anymore.

So I'm not sure how mod_perl could handle that kind of load and be horribly 
inefficient?

(I forgot to say in my previous post that over 50% of the time used by my 
script is spent on the _one_ query out of 120 that writes a smallish session 
hash to disk)

--
Bien à vous, Vincent Veyron

https://compta.libremen.com
Logiciel libre de comptabilité générale en partie double


--
John Dunlap
CTO | Lariat

Direct:
j...@lariat.co

Customer Service:
877.268.6667
supp...@lariat.co




RE: Don't use session hashes [EXT]

2020-12-21 Thread James Smith
What I'm trying to say is: yes, use session IDs if you need to - but don't if you 
don't. I see lots of PHP sites which create sessions for every visitor - and 
this tends to create a mass of 1-page sessions that never ever get used. So you 
should only create them on the first instance that you need to. You can 
create session IDs if you want to - but don't write to the database unless you 
have to...

If you need personalisation {UI tweaks}, store it directly in a cookie {you 
can sign it if you want for security reasons} - e.g. only create a proper 
session if the user is logging in or creating a "shopping cart" {in the 
loosest terms}. It takes a huge load off the file system / database.
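A signed (tamper-evident, but not encrypted) cookie value can be sketched with an HMAC; the subroutine names and the secret's source are assumptions for illustration:

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Server-side secret; where it comes from is an assumption here.
my $secret = $ENV{COOKIE_SECRET} // 'change-me';

# Append an HMAC so the client can read but not alter the value.
sub sign_value {
    my ($value) = @_;
    return $value . '.' . hmac_sha256_hex( $value, $secret );
}

# Returns the value if the signature checks out, undef otherwise.
sub verify_value {
    my ($cookie) = @_;
    my ( $value, $mac ) = $cookie =~ /\A(.*)\.([0-9a-f]{64})\z/s
        or return undef;
    return hmac_sha256_hex( $value, $secret ) eq $mac ? $value : undef;
}
```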


-Original Message-
From: Vincent Veyron  
Sent: 21 December 2020 13:51
To: modperl@perl.apache.org
Cc: James Smith 
Subject: Don't use session hashes [EXT]

On Mon, 21 Dec 2020 12:15:45 +
James Smith  wrote:

Hi James,

> 
> The first rule of session hashes is don't use session hashes, 

I thought this was a standard way of storing user information? I've copied an 
example at the bottom (this application calculates quotes for a sailmaker)

I'm not sure what else I could use?

>but the 2nd rule of session hashes is don't write them to disk - that is 
>really inefficient. Look at using something like MySQL,

Actually, I serialize the data and store it into a Postgresql table. This is 
what I meant by writing to disk. I suppose some caching by Postgresql helps 
here?

> memcached, redis, ... to >store them instead - whatever you do - just avoid 
> writing to disk!

I'll have a look, thanks


$VAR1 = {
  'dump' => 1,
  'base_currency' => 'EUR',
  'username' => 'franck-...@orange.fr',
  '_session_id' => 'Yd7Oif8hi7Xb91cyoGVK3nS7WkEEt4d0',
  'devis' => {
   'type_bateau' => '630 Q',
   'variante_for' => undef,
   'margin_rate_localized' => '',
   'option' => {
 'clew_block_cost_localized' => '   0,00',
 'lattes_forcees' => '2',
 'lattes_cost' => 
'44.457000',
 'ris' => '2',
 'boom_cover_cost' => '0',
 'clew_block_unit_price' => '0',
 'overhead_leech_line_cost' => '0',
 'bande_anti_uv_cost' => '0',
 'lattes_cost_localized' => '44,46',
 'bande_anti_uv_cost_localized' => '0',
 'clew_block' => 'Aucun',
 'chariot_unit_price' => '0',
 'cunningham' => 'False',
 'cunningham_cost_localized' => '0',
 'luff_foam_cost' => '0',
 'boom_cover_cost_localized' => '0',
 'luff_foam_cost_localized' => '0',
 'boom_cover' => 'False',
 'ris_cost_localized' => '240,00',
 'protection' => 'Aucune',
 'total_option_cost_localized' => '379,07',
 'two_ply_leech_cost' => '94.613675',
 'cunningham_cost' => '0',
 'clew_block_cost' => '0',
 'boitier_chute' => 'Standard',
 'ris_chute' => 'Poulie small',
 'batten_shape' => 'ORC VINYLESTER',
 'two_ply_leech_cost_localized' => '94,61',
 'bande_anti_uv' => 'Aucune',
 'batten_sha

RE: suggestions for perl as web development language [EXT]

2020-12-21 Thread James Smith
> (I forgot to say in my previous post that over 50% of the time used by my 
> script is spent on the _one_ query out of 120 that writes a smallish session 
> hash to disk)

The first rule of session hashes is don't use session hashes, but the 2nd rule 
of session hashes is don't write them to disk - that is really inefficient. 
Look at using something like MySQL, memcached, redis, ... to store them instead 
- whatever you do - just avoid writing to disk!
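Keeping session hashes in memcached rather than on disk can be sketched like this (the server address, key naming, and TTL are assumptions; Cache::Memcached::Fast is one common client):

```perl
use strict;
use warnings;
use Cache::Memcached::Fast;
use Storable qw(freeze thaw);

# Sessions live in memcached with a TTL, so nothing hits the disk
# on every request and stale sessions expire by themselves.
my $memd = Cache::Memcached::Fast->new(
    { servers => ['127.0.0.1:11211'] }    # illustrative address
);

sub save_session {
    my ( $sid, $session ) = @_;           # $session is a hashref
    $memd->set( "session:$sid", freeze($session), 3600 );  # 1 hour TTL
}

sub load_session {
    my ($sid) = @_;
    my $frozen = $memd->get("session:$sid");
    return $frozen ? thaw($frozen) : undef;
}
```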

-Original Message-
From: Vincent Veyron  
Sent: 21 December 2020 07:27
To: John Dunlap 
Cc: modperl@perl.apache.org
Subject: Re: suggestions for perl as web development language [EXT]


[You forgot to cc the list ]

On Sun, 20 Dec 2020 23:16:03 -0500
John Dunlap  wrote:

> We run 20 customers on a single box and our database has approximately 
> 500 tables. We run hundreds or thousands of queries per second.
> 

500 tables is a lot more than what I typically handle. I'm sure it complicates 
things.

But see this post by James Smith in a recent thread :

http://mail-archives.apache.org/mod_mbox/perl-modperl/202008.mbox/ajax/%3Cef383804cf394c53b48258531891d12b%40sanger.ac.uk%3E

Easier to read in this archive :

http://mail-archives.apache.org/mod_mbox/perl-modperl/202008.mbox/browser

I also remember a post by a chinese guy who handled the same order of database 
size, in which he wrote that he had compared several frameworks and mod_perl 
was the fastest; but that was something like 10 years ago, and I can't find it 
anymore.

So I'm not sure how mod_perl could handle that kind of load and be horribly 
inefficient?


-- 
Bien à vous, Vincent Veyron 

https://compta.libremen.com
Logiciel libre de comptabilité générale en partie double


--
 The Wellcome Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.


RE: suggestions for perl as web development language [EXT]

2020-12-21 Thread James Smith
Didn’t see the earlier response – John, if you are seeing 25% cpu utilization 
that indicates that something is wrong with the architecture of the solution 
rather than the language.

It would suggest that you have bottlenecks elsewhere – network, memory, 
database, disk. We have seen that the sweet spot for well-designed servers is 
somewhere between 4 and 8 CPUs and 8-16G RAM; after that the performance gain 
is not so good – not because of the language – but because of all the other 
constraints on the system - waiting on databases/disk/network etc. We have HPC 
compute clusters at work – and to really make use of the large CPU/memory 
machines (last time I checked it was around 200 cores and 1TB RAM) not just the 
coding but the underlying algorithm has to suit parallelisation, and care has 
to be taken to avoid writing to disk at any point in the operations.

From: Mithun Bhattacharya 
Sent: 20 December 2020 22:29
To: mod_perl list 
Subject: Re: suggestions for perl as web development language [EXT]

Running individual functions in independent threads can't be a solution for 
performance optimization - at least I have never heard such a thing, maybe 
others can pitch in.

On Sun, Dec 20, 2020 at 4:15 PM John Dunlap <j...@lariat.co> wrote:
I have no doubt that our application can be optimized. We do so whenever we 
encounter poor performance and we will continue to do so. The point is that 
Perl didn't do a lot to help us in this regard. Languages like elixir use 
immutable data structures and will automatically run individual function calls 
in separate threads. Perl, by contrast, will only have a single thread per 
request.

On Sun, Dec 20, 2020, 3:33 PM Mithun Bhattacharya <mit...@gmail.com> wrote:
You would have to define poor system performance - are you doing anything cpu 
intensive at all ? Maybe your RAM is being the bottleneck ?

On Sun, Dec 20, 2020 at 2:28 PM John Dunlap <j...@lariat.co> wrote:
It's extremely inefficient by comparison. We host our application on beefy 
servers with 32 cores and 64G of ram and I commonly see poor system performance 
with less than 25% cpu utilization.

On Sun, Dec 20, 2020, 2:22 PM Mithun Bhattacharya <mit...@gmail.com> wrote:
Agreed prefork is recommended but what is the problem with that ?

On Sun, Dec 20, 2020 at 12:47 PM John Dunlap <j...@lariat.co> wrote:
Our app segfaults at random if we use anything other than prefork.

On Sun, Dec 20, 2020, 1:32 PM Mithun Bhattacharya <mit...@gmail.com> wrote:
I am confused - you like threads so Perl is bad ? I am very happy forking away 
and yes I work a lot with non-thread-safe DBI connections without any issues.

On Sun, Dec 20, 2020 at 11:53 AM John Dunlap <j...@lariat.co> wrote:
In my opinion, no one should build new projects in Perl. The world is 
increasingly trending towards parallelism and higher numbers of cpu cores and 
Perl is poorly positioned to leverage these advancements. Many of Perl's 
dependencies are not thread safe and mod_perl forces you to use mpm_prefork. My 
organization has started moving away from Perl to Elixir for these reasons.

On Tue, Aug 4, 2020, 3:37 AM James Smith <j...@sanger.ac.uk> wrote:
Perl is a great solution for web development.

Others will disagree but the best way I still believe is using mod_perl - but 
only if you use its full power - and you probably need a special sort of mind 
set to use it - but that can be said for any language.

From experience - it may be fractionally slower than small "standalone" apps 
that dancer etc are good at, but it is (a) much, much more stable {dancer etc 
does not cope well with either large requests or lots of small requests}, and 
(b) if you have a large code base and/or a large number of services then it 
generally uses much less compute power than the others {can easily handle 
multiple services on a single apache instance}

Where it really gains is the hooks into the apache process - being able to add 
functionality easily at any stage in the request process, from path 
translation, AAA stages, pre-processing, to post-processing and logging, and 
also to interact with other languages at any stage - e.g. can handle 
pre-processing & post-processing around a script written in another language 
(e.g. PHP, Java) or produced by another webserver integrated by mod_proxy.

It isn't really a framework though like dancer or mojolicious and thus has its 
own advantages and disadvantages.

You would to some extent have to roll your own code to produce the pages 
themselves although there are libraries out there to do lots of it.

We have an in-house library whose embryonic stages were written over 20 years 
ago - it has now been stable for around 12-13 years and is still going strong...

James

-Original Message-
From: Wesley Peng <m...@yonghua.org>
Sent: 04 August 2020 06:4

RE: suggestions for perl as web development language [EXT]

2020-12-20 Thread James Smith
There are cases, though, where Plack isn't the solution and where well-written 
mod_perl is a far better (more stable) solution.

Plack is good when the backend app is simple rather than complex: backend 
requests are relatively fast, and don't use much memory.

But a warning:

(1) If you have large numbers of small apps on a domain (a couple of ours have 
over 60 admin apps under a single domain), or a large single-app code base 
where many of the larger requests are hardly used, the ability to choose which 
perl is cached in shared memory and which is loaded only when required is much 
simpler;
(2) Large code bases can also lead to very slow start-up times;
(3) If there are possibilities of large/slow requests, apache's dynamic nature 
is better at handling these and then clearing memory - each Plack process can 
hold on to large amounts of memory, and culling/restarting individual Plack 
children is difficult; the front-end proxies then have difficulty handling 
load balancing efficiently across multiple machines in this case;

Note I work on a number of projects where the data is relatively large (some 
including many billions of rows of (closely related) entries)


-Original Message-
From: Steven Lembark  
Sent: 20 December 2020 15:31
To: modperl@perl.apache.org
Cc: lemb...@wrkhors.com
Subject: Re: suggestions for perl as web development language [EXT]

On Tue, 4 Aug 2020 19:59:01 -0500
Mithun Bhattacharya  wrote:

> The question is move off to what ? I don't see alternatives being 
> shared which blows an apache+mod_perl setup out of the water.

(Sorry for being late on this...)

There are a variety of servers using Plack which can handle heavy loads and are 
both better documented and easier to manage than Apache. You can see a list at:




One big advantage to Plack is *not* having to become a walking encyclopedia of 
Apache2 internals. Shoving structs around was the only way we knew in the 80's, 
mod_perl was just an extension of "pass a struct" and keep going. Plack 
provides an abstraction that at least I find simpler to program with and things 
like Dancer2 give you the opportunity to munge the incoming request in all 
sorts of ways to handle messy situations. Beyond that take a look at the 
servers listed on Plack's website.

--
Steven Lembark
Workhorse Computing
lemb...@wrkhors.com
+1 888 359 3508




RE: cache a object in modperl [EXT]

2020-09-16 Thread James Smith
You can still have an always-up service – but it will require a bit of work and 
a load-balancing proxy set up in front of multiple apache instances. You can 
then restart each backend independently without an issue.

If the apaches are relatively lightweight you can run two on the same machine 
(e.g. on ports 8000 & 8001) and another one set up as a proxy on 80/443 which 
proxies back to these two.

I use this for a dev/live setup on a VM – where 8000 is live and 8001 is dev – 
and a lightweight apache proxies back to the other two…
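In config terms the front-end instance needs little more than this (ports and paths are illustrative, not taken from the actual setup):

```apache
# Lightweight apache on 80/443, proxying to live (8000) and dev (8001)
ProxyPreserveHost On
ProxyPass        /dev/ http://127.0.0.1:8001/
ProxyPassReverse /dev/ http://127.0.0.1:8001/
ProxyPass        /     http://127.0.0.1:8000/
ProxyPassReverse /     http://127.0.0.1:8000/
```

Each backend can then be restarted independently without the front-end ever going down.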

From: Mithun Bhattacharya 
Sent: 14 September 2020 06:49
To: mod_perl list 
Subject: Re: cache a object in modperl [EXT]

Haha I can't answer that - I work with systems which are always up. We have 
users working across the globe so there is no non-active time.

In my case I would have to throw in an independent cache (my current choice is 
REDIS but you could choose a DB_File for all I know) and refresh it as needed - 
IANA I could hit every 30 min to check for updates :)

On Mon, Sep 14, 2020 at 12:44 AM Wesley Peng <wp...@pobox.com> wrote:


Mithun Bhattacharya wrote:
> Does IANA have an easy way of determining whether there is an update
> since a certain date ? I was thinking it might make sense to just run a
> scheduled job to monitor for update and then restart your service or
> refresh your local cache depending upon how you solve it.

Yes I agree with this.
I may monitor IANA's database via their version changes, and run a
crontab to restart my apache server during the non-active user time
(i.e., 3:00 AM).

Or do you have better solution?
Thanks.




RE: Question about deployment of math computing [EXT]

2020-08-05 Thread James Smith
Wesley,

You will have seen my posts elsewhere - we work on large terabyte/petabyte-scale 
datasets {and these aren't a small number of large records but more a very, 
very large number of small records} so the memory and response times are both 
large - less so compute in some cases but not others.

The services which use apache/mod_perl work reliably and return data for these 
- the dancer/starman ones sometimes fail or hang because there are no backends 
free to serve the requests, or the backends time out requests to the 
nginx proxy (but still continue using resources). The team running the backends 
fails to notice this, because there is no easy-to-see reporting on those boxes.

We do have other services which return large amounts of data computed on the 
fly, where the response time can be multiple hours - but by carefully streaming 
the data in apache we can get the data back to the client. A similar option 
wasn't available in dancer (or wasn't at the time) to handle these sorts of 
requests, and so similar code was impossible.
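The streaming trick is essentially unbuffered printing from a response handler. A rough sketch (My::Stream and next_row() are hypothetical names, and this assumes a mod_perl 2 environment):

```perl
package My::Stream;    # hypothetical handler package
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO  ();
use Apache2::Const -compile => qw(OK);

sub next_row { return undef }   # stub: replace with the real query cursor

sub handler {
    my $r = shift;
    $r->content_type('text/plain');
    # next_row() stands in for whatever iterator walks the multi-hour query
    while ( defined( my $row = next_row() ) ) {
        $r->print("$row\n");
        $r->rflush;    # push each chunk to the client instead of buffering
    }
    return Apache2::Const::OK;
}

1;
```

Because each row is flushed as it is produced, the client starts receiving data immediately and the full result never has to sit in memory.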

In most cases starman hasn't really been the answer, and apache works 
sufficiently well. Even where people are using nginx, we are often now using 
some of the alternative apache MPMs (mpm_event), which seem to be better/more 
reliable than nginx, and which mean we don't have to have completely different 
configuration setups for some of our proxies, static servers and dynamic 
content servers.

The good thing about Apache is its dynamic rescaling, which isn't as easy with 
starman - if you have a large code base the spin-up time for starman can be 
quite large, as it appears (to make it efficient) to load in every bit of code 
that the application needs - even if it is one of those small edge cases.

So yes, use starman for simple apps if you need to, but for complex stuff I 
find a mod_perl setup more reliable.

James

-Original Message-
From: Wesley Peng  
Sent: 05 August 2020 04:31
To: dc...@prosentient.com.au; modperl@perl.apache.org
Subject: Re: Question about deployment of math computing [EXT]

Hi

dc...@prosentient.com.au wrote:
> That's interesting. After re-reading your earlier email, I think that I 
> misunderstood what you were saying.
> 
> Since this is a mod_perl listserv, I imagine that the advice will always be 
> to use mod_perl rather than starman?
> 
> Personally, I'd say either option would be fine. In my experience, the key 
> advantage of mod_perl or starman (say over CGI) is that you can pre-load 
> libraries into memory at web server startup time, and that processes are 
> persistent (although they do have limited lifetimes of course).
> 
> You could use a framework like Catalyst or Mojolicious (note Dancer is 
> another framework, but I haven't worked with it) which can support different 
> web servers, and then try the different options to see what suits you best.
> 
> One thing to note would be that usually people put a reverse proxy in front 
> of starman like Apache or Nginx (partially for serving static assets but 
> other reasons as well). Your stack could be less complicated if you just went 
> the mod_perl/Apache route.
> 
> That said, what OS are you planning to use? It's worth checking if mod_perl 
> is easily available in your target OS's package repositories. I think Red Hat 
> dropped mod_perl starting with RHEL 8, although EPEL 8 now has mod_perl in 
> it. Something to think about.

We use ubuntu 16.04 and 18.04.

We do use dancer/starman in product env, but the service only handle light 
weight API requests, for example, a restful api for data validation.

Our math computing, though, is a heavyweight service; each request takes a lot 
of time to finish, so I wonder: should it be deployed in dancer?

Since the webserver behind dancer is starman by default, and starman is event 
driven, it uses very few processes, and the processes can't scale up/down 
automatically.

We deploy starman with 5 processes by default. When 5 requests come in, all 5 
starman processes are busy computing them, so the next request will be blocked. 
Is that right?

But apache mp works in the prefork way; generally it can have as many as 
thousands of processes if resources permit. And the process management can 
scale the children up/down automatically.
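For reference, that scaling behaviour is driven by a handful of prefork directives; a sketch with illustrative numbers (not a recommendation for any particular workload):

```apache
<IfModule mpm_prefork_module>
    StartServers             5     # children started at boot
    MinSpareServers          5     # spawn more when the idle pool drops below this
    MaxSpareServers         20     # reap children when the idle pool exceeds this
    MaxRequestWorkers      250     # hard ceiling on simultaneous children
    MaxConnectionsPerChild 1000    # recycle children to cap memory growth
</IfModule>
```

Apache grows and shrinks the pool between those bounds on its own, which is the "scale up/down automatically" being described.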

So my real question is: for a CPU-consuming service, an event-driven server 
like starman has no advantage over a preforked server like Apache.

Am I right?

Thanks.




RE: suggestions for perl as web development language [EXT]

2020-08-04 Thread James Smith
Don’t talk to me about nginx/starman – it causes most of the concurrency errors 
we see where I work – but doesn’t report the issues! We just see them when 
users can’t get responses.

The code written in mod_perl had no issues with about 25% of the resources. 
Starman fails under large numbers of concurrent connections; admittedly some 
of the requests can take seconds if not minutes to return (querying tera- and 
petabyte-scale databases) so it isn’t difficult to generate lots of concurrent 
queries.


From: Mark Blackman 
Sent: 04 August 2020 22:05
To: Mithun Bhattacharya 
Cc: Joseph He ; James Smith ; John 
Dunlap ; Wesley Peng ; mod_perl list 

Subject: Re: suggestions for perl as web development language [EXT]




On 4 Aug 2020, at 21:55, Mithun Bhattacharya <mit...@gmail.com> wrote:

Ours is a REST based service so every request has business logic and an 
apache+mod_perl instance actually has a better segregation of the webserver and 
Perl code - we don't worry about handling the HTTP request and managing 
children. We trust Apache will do the right thing and if something breaks we 
have a large community of people who can help. All we worry about is our 
business logic which well no one can help if we don't know what we have coded :)

Would you like to share a Perl based webserver which can be guaranteed to be 
comparable to apache in terms of reliability and stability ?

On Tue, Aug 4, 2020 at 3:48 PM Mark Blackman <m...@blackmans.org> wrote:



On 4 Aug 2020, at 21:41, Mithun Bhattacharya <mit...@gmail.com> wrote:

I am genuinely curious what are these other "well known" means ?

On Tue, Aug 4, 2020 at 3:37 PM Mark Blackman <m...@blackmans.org> wrote:


> On 4 Aug 2020, at 17:58, Mithun Bhattacharya <mit...@gmail.com> wrote:
>
> mod_perl does have value because it does a more efficient utilization of 
> resources - this is important when fast response time and scalability is 
> important. The complexity is a known problem but it is not a mystery box 
> either - there is enough documentation which explains what has to happen and 
> what could have gone wrong.

mod_perl’s relative efficiency can be achieved by other well-known means.

That would depend on what you mean by "efficient utilisation of resources". 
You can get the same general effect, more simply, by running a high-performing 
pre-forking Perl web application server and a web server with a simple 
configuration in front of it ,instead of a complicated Apache+mod_perl 
installation.

That also buys you a nice separation of concerns, the web server handles all 
the complicated host or path rewrites and access control and the Perl app 
focuses on responding to the, now-sanitised, fully normalized, HTTP requests.

- Mark

You would still have something like Apache or Nginx handling the direct 
connection to the client and after all clean-up/rewrite/ACL logic is applied, 
then the HTTP request is passed onto something like 
https://metacpan.org/pod/Starman

- Mark





RE: suggestions for perl as web development language [EXT]

2020-08-04 Thread James Smith


From: John Dunlap 
Sent: 04 August 2020 15:30
To: Wesley Peng 
Cc: mod_perl list 
Subject: Re: suggestions for perl as web development language [EXT]

The fundamental and, in my opinion, fatal flaws of mod_perl are as follows:
> 1) Concurrency. mod_perl is pretty close to forced to use mpm_prefork because 
> very few perl dependencies are thread safe.

Concurrency in extreme conditions is actually better with mod_perl than with a 
number of other solutions – e.g. nginx/starman. Apache/mod_perl is much better 
at handling large numbers of simultaneous requests than the systems which fork 
a fixed number of small processes at start-up to handle requests. With those 
you either have to fork a large number of processes or pray you don’t get 
large numbers of simultaneous requests. Some of our systems have long return 
times for queries due to the tera/petabyte scale of some of our backend servers.

> 2) mod_perl cannot provide web sockets.

True – we haven’t really found an excuse for web-sockets, although our front 
end “Application Delivery Controller” (which sits in the DMZ) can manage 
proxying requests that need sockets one way and others that don’t another way.
There are still a lot of issues with web-sockets – not all proxies handle 
these requests – so we have to limit their use in a lot of our cases [ a lot 
of our users are on networks that sit behind proxy/cache servers ]

> Due to these reasons, my organization has started looking at ways to move 
> away from mod_perl.

We are using more off-the-shelf packages for some of our applications – e.g. 
Wordpress as a CMS/object manager – and yes, we are also moving to more 
front-end-centric applications. But many of our fundamental pieces of code are 
still running in Apache/mod_perl as it is a better, more reliable platform to 
work with.


On Tue, Aug 4, 2020 at 5:43 AM Wesley Peng <m...@yonghua.org> wrote:
greetings,

My team use all of perl, ruby, python for scripting stuff.
perl is stronger for system admin tasks, and data analysis etc.
But for web development, it seems to be not as popular as others.
It has less selective frameworks, and even we can't get the right people
to do the webdev job with perl.
Do you think in today we will give up perl/modperl as web development
language, and choose the alternatives instead?

Thanks & Regards


--
John Dunlap
CTO | Lariat

Direct:
j...@lariat.co

Customer Service:
877.268.6667
supp...@lariat.co




RE: suggestions for perl as web development language [EXT]

2020-08-04 Thread James Smith
Perl is a great solution for web development.

Others will disagree but the best way I still believe is using mod_perl - but 
only if you use its full power - and you probably need a special sort of mind 
set to use it - but that can be said for any language.

From experience - it may be fractionally slower than small "standalone" apps 
that dancer etc are good at, but it is (a) much, much more stable {dancer etc 
does not cope well with either large requests or lots of small requests}, and 
(b) if you have a large code base and/or a large number of services then it 
generally uses much less compute power than the others {can easily handle 
multiple services on a single apache instance}

Where it really gains is the hooks into the apache process - being able to add 
functionality easily at any stage in the request process, from path 
translation, AAA stages, pre-processing, to post-processing and logging, and 
also to interact with other languages at any stage - e.g. can handle 
pre-processing & post-processing around a script written in another language 
(e.g. PHP, Java) or produced by another webserver integrated by mod_proxy.
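As an illustration of those hooks: one request can pass through several Perl-visible phases. The handler package names below are made up; each would just be a module whose handler() returns an Apache2::Const status:

```apache
PerlTransHandler        My::Trans     # path translation / rewriting
PerlAuthenHandler       My::Authen    # the 'AAA' stages
PerlFixupHandler        My::Fixup     # pre-processing before the response
PerlResponseHandler     My::Response  # generate (or proxy) the content
PerlOutputFilterHandler My::Filter    # post-process/decorate the output
PerlLogHandler          My::Log       # custom logging
PerlCleanupHandler      My::Cleanup   # tidy up after the client has gone
```

Any phase can be attached or left to the default independently, which is what lets mod_perl wrap pre- and post-processing around content generated elsewhere.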

It isn't really a framework though like dancer or mojolicious and thus has its 
own advantages and disadvantages.

You would to some extent have to roll your own code to produce the pages 
themselves although there are libraries out there to do lots of it.

We have an in-house library whose embryonic stages were written over 20 years 
ago - it has now been stable for around 12-13 years and is still going strong...

James

-Original Message-
From: Wesley Peng  
Sent: 04 August 2020 06:43
To: modperl@perl.apache.org
Subject: suggestions for perl as web development language [EXT]

greetings,

My team use all of perl, ruby, python for scripting stuff.
perl is stronger for system admin tasks, and data analysis etc.
But for web development, it seems to be not as popular as others.
It has less selective frameworks, and even we can't get the right people to do 
the webdev job with perl.
Do you think in today we will give up perl/modperl as web development language, 
and choose the alternatives instead?

Thanks & Regards





Re: HTTP and MPM support

2019-01-27 Thread Dr James Smith
I would prefer to see a mod_perl 2.6 or 3 against perl 5 rather than perl 6 - 
I don't think it would get far against perl 6 as there isn't the uptake - we 
would be unlikely to migrate to a perl 6 backend - there is too much pressure 
already to move to an alternative language (python at the moment - until we 
point out most of its pitfalls when it comes to web work!) - so I don't think 
we could justify moving to a Perl 6 environment where there is little or no 
code to rely on!


On 26/01/2019 22:02, Sive Lindmark wrote:

Hi!

Perhaps in the same boat? Our company can sponsor between $1000 - $1 to a 
project developing modPerl 3 (Perl 6 + stable mpm-worker/event)

If some more in the boat can contribute also it would be awesome ...

/Sive






Re: HTTP and MPM support

2019-01-25 Thread Dr James Smith
Agree with this. We use AAA handlers - but more importantly output filters, to 
allow content to be decorated per site (independent of what generates the 
content - perl/java/php, proxied content etc.) - and we add in a few useful 
extra logging features that rely on things like transHandlers and log & 
cleanup handlers. You just can't quite get these working well with Plack/PSGI.


There is a balance between HTTP/1.1 and HTTP/2.0 that you can strike by mixing 
and matching backends dependent on content - I use three apaches on my 
machines: one using the event MPM, which receives all requests, serves some 
static content, and proxies back to either a dev apache or a live apache 
dependent on what the URL is...

This gives me the best of both worlds

On 25/01/2019 20:15, Paul B. Henson wrote:

On 1/25/2019 11:00 AM, Michael A. Capone wrote:

I have to add my voice to the growing chorus here.


Me too. Frequently when the topic of mod_perl going stale comes up 
somebody jumps in with "That's old stuff, you should be using 
PSGI/Plack". Those people simply don't understand the overall utility 
of mod_perl beyond simply running a webapp. I have 
authentication and authorization handlers written in mod_perl, and the 
ability to directly access the Apache API allows things that PSGI 
simply cannot do.





Re: Future MPM Support?

2018-06-09 Thread Dr James Smith
No - because of the way it works, mod_perl handles the request inside apache - 
the worker/event systems work by handing the request off to another process or 
processes in the background, which handle the request and then return - and 
that is where the problem lies: you are effectively adding a proxy layer 
between the web request and the actual perl process...

It limits what you can do with Plack when it comes to handling aspects of the 
request which are better handled outside the main response phase (e.g. 
rewrite, logging, cleanup etc.), which limits functionality - most people who 
just use response handlers do not see this issue. But we hook into about 10 
phases of the apache process ...



On 08/06/2018 02:08, John Dunlap wrote:

Does using mod_perl properly allow you to use mpm_event or mpm_worker?

On Thu, Jun 7, 2018 at 9:19 PM, Dr James Smith <j...@sanger.ac.uk> wrote:


Unfortunately Plack (and Catalyst especially) are a fairly poor
comparison to using mod_perl properly {unfortunately very few
people do so} I've looked at Dancer and Catalyst - both are OK at
what they do - but they don't really handle things in the really
clean easy way that mod_perl does {if you attach code to the right
handlers/filters} meaning chopping in and changing code can be
quite difficult in them.

Both are good for simplish applications {yes and I've seen complex
apps written in them as well - but they usually need a lot more
hardware support than the equivalent mod_perl app to cope with demand}

Unfortunately writing good mod_perl apps is hard - and so few
mod_perl apps really make use of the underlying framework properly
- effectively using it for code caching and not much else



On 07/06/2018 19:24, David Hodgkinson wrote:

Moving your method handlers to the framework.

I like catalyst. Stand on the shoulders of giants. Mojolicious
makes me itch.

On 7 Jun 2018, at 19:21, John Dunlap <j...@lariat.co> wrote:


What is involved in porting an application from mod_perl to starman?


Throwing away logic and logical structure and replacing it with a
much less flexible approach...

On Thu, Jun 7, 2018 at 6:18 PM, Clive Eisen <cl...@hildebrand.co.uk> wrote:


On 7 Jun 2018, at 19:13, David Hodgkinson <daveh...@gmail.com> wrote:

No. Different concept.

On 7 Jun 2018, at 18:52, John Dunlap <j...@lariat.co> wrote:


Is Plack backwards compatible with mod_perl?

On Thu, Jun 7, 2018 at 5:44 PM, David Hodgkinson <daveh...@gmail.com> wrote:

We’re all about the Plack these days.



This.

We have moved entirely to

nginx (doing the ssl where appropriate) -> starman (which
uses plack) and Dancer2

Life is a LOT better

—
Clive




-- 
John Dunlap

CTO | Lariat

Direct:
j...@lariat.co

Customer Service:
877.268.6667
supp...@lariat.co








--
John Dunlap
CTO | Lariat

Direct:
j...@lariat.co

Customer Service:
877.268.6667
supp...@lariat.co






Re: Future MPM Support?

2018-06-07 Thread Dr James Smith
Unfortunately Plack (and Catalyst especially) are a fairly poor 
comparison to using mod_perl properly {unfortunately very few people do 
so} I've looked at Dancer and Catalyst - both are OK at what they do - 
but they don't really handle things in the really clean easy way that 
mod_perl does {if you attach code to the right handlers/filters} meaning 
chopping in and changing code can be quite difficult in them.


Both are good for simplish applications {yes and I've seen complex apps 
written in them as well - but they usually need a lot more hardware 
support than the equivalent mod_perl app to cope with demand}


Unfortunately writing good mod_perl apps is hard - and so few mod_perl 
apps really make use of the underlying framework properly - effectively 
using it for code caching and not much else




On 07/06/2018 19:24, David Hodgkinson wrote:

Moving your method handlers to the framework.

I like catalyst. Stand on the shoulders of giants. Mojolicious makes 
me itch.


On 7 Jun 2018, at 19:21, John Dunlap wrote:



What is involved in porting an application from mod_perl to starman?

Throwing away logic and logical structure and replacing it with a much 
less flexible approach...
On Thu, Jun 7, 2018 at 6:18 PM, Clive Eisen wrote:



On 7 Jun 2018, at 19:13, David Hodgkinson <daveh...@gmail.com> wrote:

No. Different concept.

On 7 Jun 2018, at 18:52, John Dunlap <j...@lariat.co> wrote:


Is Plack backwards compatible with mod_perl?

On Thu, Jun 7, 2018 at 5:44 PM, David Hodgkinson <daveh...@gmail.com> wrote:

We’re all about the Plack these days.



This.

We have moved entirely to

nginx (doing the ssl where appropriate) -> starman (which uses
plack) and Dancer2

Life is a LOT better

—
Clive










Re: capture exception

2017-05-30 Thread James Smith



On 2017-05-30 03:49 PM, Dirk-Willem van Gulik wrote:


On 30 May 2017, at 16:43, John Dunlap wrote:


How is it a security hole?

….


> my $ret = eval { $m->...() };



Just imagine $m->...() returning something containing a valid perl 
expression such as `rm -rf /`; or system("rm -rf /"); or something 
that wires up a shell to a TCP socket.


Dw.

But that isn't how it works - with the "{" "}" braces, $m->...() is run 
and its return value is captured as plain data; it is never re-parsed as 
code. Only a string eval would do that - the two types of eval are different.
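A minimal sketch of the distinction being made here (the hostile-looking string is only ever held as data in the block form):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $input = q{system("rm -rf /")};   # hostile-looking *data*

# Block eval: the code inside the braces was written by us and is
# compiled at compile time; $input is just a string being measured,
# never executed.
my $len = eval { length $input };
print "block eval saw $len characters of data\n";

# String eval: the string itself is compiled and run as Perl source.
# With untrusted input this would execute the attacker's code:
#   my $oops = eval $input;   # DO NOT do this with user input
```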





Re: capture exception

2017-05-30 Thread James Smith
String eval should be avoided at all costs [especially if you parse user 
input] - functional (block) eval is different, and is a good model for 
catching errors etc.


{There are some good uses of string eval - e.g. dynamically "use"ing 
modules}


James


On 2017-05-30 03:46 PM, Ruben Safir wrote:

Using eval is an unacceptable security bug for all online and public-access
programs that acquire data from external non-secured sources.



On Tue, May 30, 2017 at 09:39:53AM -0400, John Dunlap wrote:

Yes, I do that extensively and it works perfectly. It's as close to a true
Try/Catch block as we have in the perl world. However, I *usually* do not
return values from it because I use this construct to control my database
transaction demarcation and using the return value from outside of the eval
wouldn't be inside the transaction. With that said, I have had to do it
from time to time and it works just fine. Also, it is advisable to copy the
contents of $@ into a separate variable immediately. My understanding is
that this can prevent some weird concurrency issues, under some conditions.
My general form looks something like this,

my $return = eval {
 # BEGIN DATABASE TRANSACTION

 # DO SOME STUFF

 # COMMIT DATA BASE TRANSACTION

 return 'SOME VALUE';
};

if ($@) {
 my $error = $@;

 # ROLLBACK DATABASE TRANSACTION

 # LOG ERROR
}


On Tue, May 30, 2017 at 4:47 AM, James Smith  wrote:


Not really a mod_perl question but you can always wrap your method call in
an eval

my $ret = eval { $m->...() };

And then check $@ for the error message
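Put together in a mod_perl 2 response handler, the idea looks roughly like this. This is a sketch only; My::DomainHandler and My::Checker::check are invented names standing in for the thread's module, and the JSON shape mirrors the example output above:

```perl
package My::DomainHandler;   # hypothetical module name
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO  ();
use Apache2::Const -compile => qw(OK);
use JSON ();

sub handler {
    my $r = shift;
    my ($domain) = ($r->args // '') =~ /\bd=([^&;]+)/;

    my $result = eval { My::Checker::check($domain) };  # may croak
    my $error  = $@;       # copy $@ immediately, as advised above

    $r->content_type('application/json');
    if ($error) {
        # Turn the croak into a clean JSON error instead of a 500
        $r->print(JSON::encode_json({ domain => $domain, error => "$error" }));
    }
    else {
        $r->print(JSON::encode_json($result));
    }
    return Apache2::Const::OK;
}
1;
```

This answers the original question: fix it in the handler, by trapping the module's croak and turning it into a well-formed response.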


On 2017-05-26 02:08 AM, Peng Yonghua wrote:


greeting,

I am not so good at perl/modperl,:)

In the handler, a method from a class was called, when something dies
from within the method, what's the correct way the handler will take?

for example, I wrote this API which works right if given a correct domain
name:

http://fenghe.org/domain/?d=yahoo.com

server response:
var data={"registration":"domain may be taken","domain":"yahoo.com"}

If given a wrong domain name:

http://fenghe.org/domain/?d=yahoo.nonexist

The server returns 500.

This is because, in the handler, I used this module (wrote also by me):

http://search.cpan.org/~pyh/Net-Domain-Registration-Check-0.
03/lib/Net/Domain/Registration/Check.pm

And in the module, a croak like this happened:

croak "domain TLD not exists" unless tld_exists($tld);

When handler meets the croak, it dies (I guess) and server returns 500.

How will I make the full system work right? fix on handler, or the module
itself?

Thanks.




--
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a company
registered in England with number 2742969, whose registered office is 215
Euston Road, London, NW1 2BE.













Re: capture exception

2017-05-30 Thread James Smith
Not really a mod_perl question but you can always wrap your method call 
in an eval


my $ret = eval { $m->...() };

And then check $@ for the error message


On 2017-05-26 02:08 AM, Peng Yonghua wrote:

greeting,

I am not so good at perl/modperl,:)

In the handler, a method from a class was called, when something dies 
from within the method, what's the correct way the handler will take?


for example, I wrote this API which works right if given a correct 
domain name:


http://fenghe.org/domain/?d=yahoo.com

server response:
var data={"registration":"domain may be taken","domain":"yahoo.com"}

If given a wrong domain name:

http://fenghe.org/domain/?d=yahoo.nonexist

The server returns 500.

This is because, in the handler, I used this module (wrote also by me):

http://search.cpan.org/~pyh/Net-Domain-Registration-Check-0.03/lib/Net/Domain/Registration/Check.pm 



And in the module, a croak like this happened:

croak "domain TLD not exists" unless tld_exists($tld);

When handler meets the croak, it dies (I guess) and server returns 500.

How will I make the full system work right? fix on handler, or the 
module itself?


Thanks.






Re: mod_perl -> application server

2017-04-06 Thread James Smith
You can use mod_perl properly and write yourself a request handler which 
handles the routing for you, rather than using CGI scripts. I use this 
model exclusively on my servers...


Most of the scripts are converted to action modules, which are 
dynamically compiled by the handler (which acts as a router/controller)


It uses apreq2 (APR) rather than CGI

I think CGI::Application is (or at least was) deprecated.
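A rough sketch of that router/controller model, using apreq2 for parameter handling. All names here (My::Router, My::Action::*) are invented for illustration:

```perl
package My::Router;    # hypothetical controller
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Request    ();   # apreq2 parameter parsing, not CGI.pm
use Apache2::Const -compile => qw(OK NOT_FOUND);

sub handler {
    my $r   = shift;
    my $req = Apache2::Request->new($r);   # APR-level param access

    # Map the first path segment to an "action" module and compile it
    # on demand, so new actions need no extra Apache configuration.
    my ($name) = $r->uri =~ m{^/(\w+)};
    my $class  = 'My::Action::' . ucfirst(lc($name // 'index'));

    eval "require $class; 1" or return Apache2::Const::NOT_FOUND;
    return $class->run($r, $req);   # action returns an Apache status code
}
1;
```

The string eval here is the "dynamically require a module" case mentioned elsewhere in these threads as one of the legitimate uses of string eval.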


On 2017-04-06 11:05 AM, Tosh Cooey wrote:
Hi, after the recent discussion here about Perl application servers I 
realized that the architecture I designed is probably better suited to 
usage with an application server than mod_perl.








Re: mod_perl Website Hosting

2017-03-09 Thread James Smith
As I want to stay in the UK, I've been using bigV.io services from 
Bytemark - slightly more expensive, and you have to set up from scratch - 
but really nice VMs, not difficult to set up, and I get to configure 
them exactly as I want.



On 2017-03-09 04:04 PM, Vincent Veyron wrote:

On Sun, 5 Mar 2017 12:40:10 -0500
 wrote:


Just trying to update my knowledge about
website hosting services.
Can anyone recommend hosting companies
that have a good track record of hosting mod_perl
applications?

I'm late on this, but as Randolf said, your best bet is probably a dedicated 
server. I've been using low-end dedicated servers from online.net and 
kimsufi.com at 10.00 euros/month for years, no problem at all.

I even see an offer at 4.99€/month now:

https://www.kimsufi.com/fr/serveurs.xml









Re: Cache refresh each 50 queries?

2016-10-05 Thread James Smith
You can look at Apache::SizeLimit as an alternative - this is designed 
to cope with applications which occasionally "leak memory"...

If one request uses a lot of memory, it will not be recovered - Perl 
doesn't hand this memory back to the OS - so subsequent requests are 
handled by the inflated process.


We use mod_perl handlers to track this, to see which requests cause the 
memory inflation - those can then be examined to see if they can be 
optimized! Check that large data structures are not being copied 
around - avoid processing one large data structure into another large 
data structure, etc.
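A sketch of wiring Apache2::SizeLimit in at server startup. The thresholds are illustrative, not recommendations, and the handler registration line should be checked against the module's documentation for your mod_perl version:

```perl
# startup.pl -- loaded with "PerlPostConfigRequire startup.pl"
use strict;
use warnings;
use Apache2::SizeLimit;

# Recycle a child once it grows past ~512 MB (value is in KB);
# checking only every few requests keeps the per-request overhead down.
Apache2::SizeLimit->set_max_process_size(512 * 1024);
$Apache2::SizeLimit::CHECK_EVERY_N_REQUESTS = 5;

# httpd.conf also needs the handler registered, per the module docs:
#   PerlCleanupHandler Apache2::SizeLimit

1;
```

Unlike a flat MaxRequestsPerChild, this only kills children that have actually inflated.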


On 10/5/2016 11:39 AM, A. Warnier wrote:

Hi again.

I want to correct somewhat what I was saying below, in "Additional hint".
See : 
http://perl.apache.org/docs/1.0/guide/performance.html#Sharing_Memory


What I was saying about "leaking memory" below remains true.

But even with applications which do not "leak memory" per se, there 
will anyway be some increase in the amount of memory that each Apache 
child is using, over time.
So the "best" setting for "MaxRequestsPerChild", in an Apache/mod_perl 
scenario, may not be 0 ( = unlimited).  Instead, use some relatively 
large number, such as 1000 for example.
That will cause a child to serve 1000 requests before it gets replaced 
by a new one.
And having a slightly longer delay every 1000 requests, would probably 
not even be noticed. (Of course, if your server crashes with an OOM 
before you reach 1000, then you will have to reduce that number, 
experimentally).


The reason is well explained in that page above.
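As a concrete, purely illustrative prefork fragment reflecting the advice above:

```apache
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    # Recycle each child after 1000 requests; lower this experimentally
    # if memory growth still triggers OOM before the limit is reached.
    MaxRequestsPerChild 1000
</IfModule>
```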


On 05.10.2016 11:30, A. Warnier wrote:

Additional hint :

Normally, when Apache goes through the "effort" of setting up a new 
child process, you
would want that this child process then runs for as long as possible 
(for as many requests
as possible), to avoid the repeated overhead of restarting another 
child process.

In an Apache prefork configuration, you would do this by setting

MaxRequestsPerChild 0   (or a very high number)

(as per 
http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild)


However, some perl modules/applications are known to "leak memory" 
(meaning that each time
they are called, the Apache child memory footprint increases a bit, 
and never goes down
anymore).  And after a while, depending on how busy the site is and 
how many times that
application is called, Apache may crash if it does not have enough 
memory anymore.


That is why, in some cases, you would set this MaxRequestsPerChild to 
a value different
from 0, to limit the number of requests that any Apache child would 
process, before it is

forcefully terminated by Apache, and replaced by a new one.
(The exact number used, would depend on how much additional memory it 
uses after each

request).
Terminating this child process would return the used memory to the 
OS, and the new child

would start again fresh, with a minimum amount of memory.
Then it will start growing again, until it hits the limit again, etc..

So, setting MaxRequestsPerChild <> 0, is in fact a "quick-and-dirty 
fix" for a problem of
the application (it should not do that, leaking memory). And it would 
be better to fix the

application, so that it can run without leaking memory..

P.S.
You still have not indicated what your OS is, or what kind of Apache 
MPM you are using.
If you need more help, please do that the next time you post a 
message here.



On 05.10.2016 10:53, SUZUKI Arthur wrote:

Hello David, André,

the server we're working with is fully dedicated to Koha.

Thank you André for your hints about Apache configurations.

@David, same thing - I'll try to preload C4::ILSDI::Services and see 
if it helps.


Thanks to both of you, I'll let you know about the results.
Cheers,
Arthur

On 05/10/2016 at 01:20, David Cook wrote:
Sounds like André is on the right track. I've certainly run into a 
similar

issue (with a non-Koha app).

However, because I was using the Catalyst framework, I was able to 
just
preload the entire app, so that the Perl modules were loaded into 
the Apache
master process before forking, and that did the trick. That app is 
a lot
smaller than Koha though too. This case is a bit more complicated 
since Koha
isn't really a MVC app, but you could look at the Koha Plack 
examples and

see which modules they pre-load.
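In mod_perl terms, preloading means pulling the heavy modules into the parent process from a startup file. A sketch, with an illustrative path (C4::ILSDI::Services is the Koha module discussed above):

```perl
# startup.pl, referenced from httpd.conf with something like:
#   PerlPostConfigRequire /etc/koha/startup.pl   (path illustrative)
use strict;
use warnings;

# Modules loaded here are compiled once in the parent, so every forked
# child shares the code pages via copy-on-write instead of recompiling.
use DBI;
use C4::ILSDI::Services;

1;
```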

You might also want to run ilsdi.pl with Devel::NYTProf to identify 
what

exactly is slowing down that 1s long query.

I've only played with PerlOptions +Parent a bit, but I'm guessing 
that you
have multiple mod_perl apps, and that we're not actually seeing 
your entire

httpd.conf configuration relating to Koha, right?

David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St
Ultimo, NSW 2007
Australia

Office: 02 9212 0899
Direct: 02 8005 0595



-Original Message-
From: André Warnier [mailto:a...@ice-sa.com]
Sent: Tuesday, 4 October 2016 10:02 PM
To: modperl@perl.apache.org
Subject: Re: Cache refresh each 50 q

Re: Recommended Linux distribution for LAMP/mod_perl

2016-10-03 Thread Dr James Smith
We tend now to use Ubuntu LTS setups for our webservers - currently a 
mix of 12.04, 14.04 and 16.04, depending on which part of the production 
cycle we are on (yes, we have at least 60 for approx 120 different 
websites)...



On 03/10/2016 18:09, John Dunlap wrote:
You're going to be better off with Debian than you will be with CentOS 
because Debian actually ships with precompiled mod_perl packages.


On Mon, Oct 3, 2016 at 1:08 PM, daniel.axtell wrote:


I've been trying to migrate a site with a lot of Perl legacy code
running under Apache 2.2 and mod_perl.  The server I was migrating
to uses CentOS 7, and the default Apache 2.4 and perl 5.16 seem
unusually difficult to configure.  I'm not even able to get CGI
scripts to run.   In the past I've built Perl, Apache and mod_perl
from source, but that seems like a lot of unnecessary work. 
Ideally I'd like to use the stock Apache and Perl from the

distribution, and just install CPAN modules, data and config files
and go.  I'm curious if people here find a particular Linux
distribution Perl and mod_perl friendly, as the RedHat and CentOS
distributions seem pretty hostile.  CentOS 7 has a third-party
module of mod_perl 2.0.8 but if I can't get CGI working correctly
I don't really trust it.

Should I just assume building everything in the LAMP stack from
source is the way to go?

Dan










Re: Alternatives to CGI perl module

2016-09-11 Thread Dr James Smith
CGI.pm is still good - but if you are using mod_perl "properly" then it 
is worth looking at APR (apreq2), which is what CGI.pm itself uses under 
the hood when run under mod_perl... It is faster than CGI (one less level 
of abstraction), although there is a minor bug in it - it is broken under 
mod_perl when it doesn't expand the environment.


We moved away from CGI to using APR, as this made life easier in our 
mod_perl framework.


Yes, you can look at PSGI and Plack - but in our situation we found quite 
a number of issues leaving that setup open to DoS attacks - some of our 
requests are very heavy, and we wanted to selectively apply multiple 
layers (as in different combinations of handlers) to the application - 
both things PSGI/Plack's design doesn't like. Also, we have 100s of 
applications running on the same server, which doesn't work nicely in 
the Plack environment - multiple servers would really be required, which 
adds server overhead...


As for templates - invariably, if you want to apply sitewide style to a 
site, they are the WRONG solution - but I've had this debate with many 
people and most don't see the point, until I show them how simply I can 
change the way tables work on a site... It does involve writing good 
CSS/JS and not relying on the presentational frameworks which are 
becoming far too commonplace!








Re: which framework is best suitable for modperl?

2016-07-25 Thread James Smith



On 7/20/2016 4:04 PM, Steven Lembark wrote:

On Wed, 20 Jul 2016 11:55:24 +0800
yhp...@orange.fr wrote:


Though I have written several handlers using mp2, but for further web
development under modperl, what framework do you suggest to go with?

Q: What do you mean by "framework"?
  

(I have few experience on Dancer, which I don't think work together
with MP).

Frankly, you are better off with Plack and PSGI (a.k.a. Dancer,
Dancer2, twiggy, etc) than trying to embed anything into apache
with mod_perl. There are modules for converting mod_perl request
objects into plack, which simplifies all of your code and makes
debugging it a whole lot easier (since you can use perl -d
rather than dealing with mock apache objects).
That isn't the case if you make use of all the nice features of mod_perl -
there just isn't the support in Plack to do the same things. You can do
really nasty coding to get round some of it (and it is nasty).
You don't have the same flexibility in turning phases on and off as you do
in mod_perl, so you end up with multiple layers of Plack objects - which
adds inefficiency...
And when it comes to output filters and the like, there really isn't an
equivalent in Plack that achieves the same flexibility...

We have applications in perl, javascript (node), java and php which hook
into the request phase of a perl server but are wrapped with mod_perl
layers to achieve templating, security etc. It is not quick and easy to
do this with Plack in quite as seamless a way.

We have also noticed performance issues with many of the Plack
implementations - they are great when traffic is low, but unless you run
under mod_perl they don't tend to cope with large (heavy-request)
traffic as well as mod_perl does.






If you have to run code w/in apache then try the PSGI interface:



You can test the code with perl -d and run it in anything from
twiggy to starman to apache when you are done.







Re: ApacheCon: Getting the word out internally

2016-07-19 Thread James Smith



On 7/19/2016 9:58 AM, yhp...@orange.fr wrote:

Jie,

I have been using Apache::DBI, but I don't think it is something like 
JDBC.


Thankfully not - JDBC is one of the biggest nightmares our DBAs face - 
if we have network issues (e.g. firewall session timeouts) we have had 
all sorts of problems with Oracle JDBC connections.

We actually don't even use Apache::DBI (we are primarily MySQL), as the 
"uncontrolled" caching of connections can lead to problems - flooding 
database servers with open connections, connections failing because of 
firewall issues. We explicitly cache (and reconnect) using 
DBIx::Connector to cope with the few databases where we explicitly want 
fast connections...

Admittedly, many of our servers have upwards of 30-40 apps on them, 
talking to 40-50 different databases (quite often on the same server!)
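A minimal DBIx::Connector sketch of that explicit cache-and-reconnect pattern (the DSN, credentials and table name are made up for illustration):

```perl
use strict;
use warnings;
use DBIx::Connector;

# One connector per DSN, created once at startup. In "fixup" mode the
# handle is only pinged/reconnected after a query fails, so a firewall
# silently dropping an idle session costs one retry rather than a dead app.
my $conn = DBIx::Connector->new(
    'dbi:mysql:database=app;host=dbhost',   # illustrative DSN
    'appuser', 'secret',
    { RaiseError => 1, AutoCommit => 1 },
);

my $rows = $conn->run(fixup => sub {
    # $_ is the live database handle inside run()
    $_->selectall_arrayref('SELECT id, name FROM widgets');
});
```

Unlike Apache::DBI's per-process implicit caching, the connection policy here is visible and tunable in one place.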






Re: ApacheCon: Getting the word out internally

2016-07-19 Thread James Smith



On 7/19/2016 4:26 AM, yhp...@orange.fr wrote:

so, will go for support of perl6?

Probably, once it becomes more prevalent - the Perl 6 community is still 
relatively small {moving current Perl 5 codebases to it will be non-trivial} 
and most will not see the gain from doing so... It will take time to get 
traction (similar to other moves, like Drupal 7 -> 8).

It is a different language, and you have to think differently and code 
differently to use it...


perl 5 and mod_perl are still going strong... the point is that for the 
best part of 20 years both have been doing well and doing things right - 
they need tweaks here and there, but you don't need to do huge amounts 
of development on top of what is already one of the best base levels 
for web development...





Re: Bad rap

2016-06-14 Thread James Smith
We have upwards of 200 people creating content and services - in 
differing frameworks, languages... On the current servers we maintain in 
mod_perl, at the last count we had about 30-40 developers writing perl 
code across probably 30 or 40 different applications within the same 
cluster of webservers.


James

On 6/14/2016 3:52 PM, John Dunlap wrote:
Though, if you have no control over what apps you have to support and 
they are written in multiple architectures... I can totally see where 
you're coming from.


On Tue, Jun 14, 2016 at 10:48 AM, James Smith <j...@sanger.ac.uk> wrote:


We have multiple apps, and we just switch our auth/page
wrapper/logging/debugging code in and out as we need to; if someone
else comes up with an app, we tell them what to do to get the
information and "job's a good-un" - much simpler than having
multiple embedded login/authentication/... methods... We know,
because other projects use the style of frameworks you are talking
about - and we say "you just XYZ", then realize that they are
using some nginx/psgi/starman solution and have to go - aargh - no,
you can't just do that - you will have to re-engineer your app!

James

On 6/14/2016 3:40 PM, John Dunlap wrote:

We don't use any of those hooks into Apache. mod_perl invokes our
main handler and, from there, we do everything ourselves. We even
built our own authentication and authorization mechanisms,
directly into our application, instead of relying on Apache to
provide them. We've contained all mod_perl specific code to 2-3
files so that we have more freedom to decide how and where our
application will be deployed.

On Tue, Jun 14, 2016 at 10:37 AM, James Smith <j...@sanger.ac.uk> wrote:


On 6/14/2016 3:28 PM, John Dunlap wrote:

https://www.nginx.com/blog/nginx-vs-apache-our-view/


Unfortunately for us we actually use some of those 500 things
that apache is good at, that nginx doesn't do:

  * Making use of all the handler/filter hooks in apache;
  * Fronting a complex web-application, where requests by
definition take a long time to return (the databases and
related queries are complex)




















Re: Bad rap

2016-06-14 Thread James Smith
We have multiple apps, and we just switch our auth/page 
wrapper/logging/debugging code in and out as we need to; if someone else 
comes up with an app, we tell them what to do to get the information and 
"job's a good-un" - much simpler than having multiple embedded 
login/authentication/... methods... We know, because other projects use 
the style of frameworks you are talking about - and we say "you just 
XYZ", then realize that they are using some nginx/psgi/starman solution 
and have to go - aargh - no, you can't just do that - you will have to 
re-engineer your app!


James

On 6/14/2016 3:40 PM, John Dunlap wrote:
We don't use any of those hooks into Apache. mod_perl invokes our main 
handler and, from there, we do everything ourselves. We even built our 
own authentication and authorization mechanisms, directly into our 
application, instead of relying on Apache to provide them. We've 
contained all mod_perl specific code to 2-3 files so that we have more 
freedom to decide how and where our application will be deployed.


On Tue, Jun 14, 2016 at 10:37 AM, James Smith <j...@sanger.ac.uk> wrote:



On 6/14/2016 3:28 PM, John Dunlap wrote:

https://www.nginx.com/blog/nginx-vs-apache-our-view/


Unfortunately for us we actually use some of those 500 things that
apache is good at, that nginx doesn't do:

  * Making use of all the handler/filter hooks in apache;
  * Fronting a complex web-application, where requests by
definition take a long time to return (the databases and
related queries are complex)













Re: Bad rap

2016-06-14 Thread James Smith


On 6/14/2016 3:28 PM, John Dunlap wrote:

https://www.nginx.com/blog/nginx-vs-apache-our-view/

Unfortunately for us we actually use some of those 500 things that 
apache is good at, that nginx doesn't do:


 * Making use of all the handler/filter hooks in apache;
 * Fronting a complex web-application, where requests by definition
   take a long time to return (the databases and related queries are
   complex)






Re: Bad rap

2016-06-13 Thread James Smith
All our experiences at work with nginx/psgi & nginx/fastcgi are poor - 
nginx is very good for static content, but if any of your queries takes 
any length of time, and/or the fastcgi/psgi requests are numerous 
relative to the static content served by nginx, then there are quite 
significant error/performance issues. In our case the only static files 
are mainly images... The rest of the content is dynamic - whether it is 
server-cached pages or real dynamic content...


We have a load-balancing proxy in front of our apaches, so we could fork 
off content elsewhere if it needed to be served fast! We don't, because 
Apache itself is fast enough! Admittedly we have taken a lot of care to 
reduce the overall number of requests to a minimum (1 page, 1 CSS, 1 JS 
+ a handful of images per page).


The hacks we would have to do in PSGI/FastCGI to get these features 
would probably negate any gains from the move away from Apache.


Apache is fast enough if you use it properly!!


On 6/13/2016 11:58 AM, John Dunlap wrote:


Speaking as someone would like to migrate to Nginx, at some point, the 
big advantage of Nginx really has nothing to do with mod_perl. It has 
to do with Apache. The way Apache processes requests is fundamentally 
slower than Nginx and, consequently, Nginx scales better.


On Jun 13, 2016 6:54 AM, "James Smith" <j...@sanger.ac.uk> wrote:


Just posted:

mod_perl is a much better framework than PSGI or FastCGI IF you make
use of the integration of perl into all the stages of apache (you
can hook into about 15 different stages in the Apache life cycle).

We make extensive use of the input and output filters, AAA layers,
clean up, logging, server startup, etc. processes - used this way it
is one of the best web frameworks you can use.

We have sites where content is produced by static files, mod_perl,
php, or java (or proxied back from some ancient CGI software), all
processed by the same mod_perl code in the output filter to look the
same - or different if it was serving a different site!

If all you are interested in is wrapping CGI scripts in a cached
interpreter for performance, then yes, you can move to one of these
other frameworks - but then you will spend lots of time and effort
implementing the features that are virtually free with
apache/mod_perl!


On 6/11/2016 7:11 PM, Vincent Veyron wrote:

Hi all,

See this post on reddit :


https://www.reddit.com/r/linuxadmin/comments/4n5seo/apache_22_mod_perl_to_nginx/

Please help set the record straight. Ancient technology WTF?












Re: Bad rap

2016-06-13 Thread James Smith

Just posted:

mod_perl is a much better framework than PSGI or FastCGI IF you make use 
of the integration of perl into all the stages of apache (you can hook 
into about 15 different stages in the Apache life cycle).


We make extensive use of the input and output filters, AAA layers, clean 
up, logging, server startup, etc. processes - used this way it is one of 
the best web frameworks you can use.


We have sites where content is produced by static files, mod_perl, php, 
or java (or proxied back from some ancient CGI software), all processed 
by the same mod_perl code in the output filter to look the same - or 
different if it was serving a different site!


If all you are interested in is wrapping CGI scripts in a cached 
interpreter for performance then yes you can move to one of these other 
frameworks - but then you have already spent lots of time and effort 
implementing the features that are virtually free with apache/mod_perl!



On 6/11/2016 7:11 PM, Vincent Veyron wrote:

Hi all,

See this post on reddit :

https://www.reddit.com/r/linuxadmin/comments/4n5seo/apache_22_mod_perl_to_nginx/

Please help set the record straight. Ancient technology WTF?







--
The Wellcome Trust Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 

Re: close connection for request, but continue

2016-04-21 Thread James Smith
A job queue is also better because it stops uncontrolled forking and 
excessive numbers of "dead" web connections hanging around; it just 
queues requests until resources are available. You may find that handling 
several of these jobs in parallel eats up all your processor/memory 
resources, whereas with a queue you can limit the number of processes 
running in parallel (and if your site gets bigger you may be able to 
hand some of this off to a cluster of machines to handle the 
long-running processes).



On 4/21/2016 3:25 PM, Perrin Harkins wrote:
On Thu, Apr 21, 2016 at 9:48 AM, Iosif Fettich wrote:


I'm afraid that won't fit, actually. It's not a typical Cleanup
I'm after - I actually want to not abandon the request I've
started, just for closing the incoming original request. The
cleanup handler could relaunch the slow back request - but doing
so I'd pay twice for it.


You don't have to. You can just return immediately, and do all the 
work in the cleanup (or a job queue) while you let the client poll for 
status. It's a little extra work for simple requests, but it means all 
requests are handled the same and you never make extra requests to 
your expensive backend.


If you're determined not to do polling from the client, your best bet 
is probably to fork immediately and do the work in the fork, while you 
poll to check if it's done in your original process. You'd have to 
write the response to a database or something that the original 
process can pick it up from. But forking from mod_perl is a pain and 
easy to mess up, so I recommend doing one of the other approaches.


- Perrin






Re: Upgrade to Apache 2.4 breaks encoding in a PerlOutputFilterHandler

2015-11-29 Thread Dr James Smith
I knew it was a problem, but with our setup of a front-end proxy in 
front of mod_perl it wasn't an issue: the mod_perl server handles the 
filter and the front-end proxy does the gzipping (we use Brocade Traffic 
Managers and Apache in different places). In most production 
environments this is the "usual" setup...


On 29/11/2015 18:15, Vincent Veyron wrote:

On Sun, 29 Nov 2015 19:59:28 +1100
Jie Gao  wrote:

Well, check out this: 
https://perl.apache.org/docs/2.0/user/handlers/filters.html#C_PerlOutputFilterHandler_
 .


Hi Jie,

Yes, the instructions on this page work well; but with those, I need to add 
'SetOutputFilter DEFLATE' in every virtual host.

I use a global configuration in deflate.conf, and got it to work in 2.4 by 
replacing :

#   AddOutputFilterByType DEFLATE text/html text/plain text/xml

with this:

FilterDeclare   COMPRESS CONTENT_SET

FilterProvider  COMPRESS DEFLATE "%{CONTENT_TYPE} =~ m#^text/(html|plain)#"
FilterChain COMPRESS
FilterProtocol  COMPRESS  DEFLATE change=yes;byteranges=no

Which places mod_deflate after my PerlOutputFilterHandler in the chain.

I was surprised nobody else had the problem before, though?







Re: Apache2 filter

2015-10-02 Thread James Smith
perl -cw sometimes throws errors with mod_perl code, as it isn't 
running in the Apache environment...

I get the same warning testing my output filter handler when running 
with -cw, but it works well in Apache!

On 10/1/2015 6:59 PM, A. Warnier wrote:

Hi.

I am trying to write an Apache2 request filter.
According to the online tutorial 
(http://perl.apache.org/docs/2.0/user/handlers/filters.html#Output_Filters). 
I have this so far :


package MyFilter;
...
use base qw(Apache2::Filter);
...
use constant BUFF_LEN => 4096;

sub handler : FilterRequestHandler {
    my $f = shift;
    my $content = '';

    while ($f->read(my $buffer, BUFF_LEN)) {
        $content .= $buffer;
    }
}

 but when I compile this :

aw@arthur:~/tests$ perl -cw PAGELINKS.pm
Invalid CODE attribute: FilterRequestHandler at PAGELINKS.pm line 50.
BEGIN failed--compilation aborted at PAGELINKS.pm line 50.
aw@arthur:~/tests$

platform data (from Apache log) :
[Tue Sep 01 06:25:10 2015] [notice] Apache/2.2.16 (Debian) DAV/2 
SVN/1.6.12 mod_jk/1.2.30 mod_apreq2-20090110/2.7.1 mod_perl/2.0.4 
Perl/v5.10.1 configured -- resuming normal operations


There are already many other mod_perl modules of all kinds running on 
that same server (but not filters).


What am I missing?

André






Re: Random segmentation fault

2015-09-06 Thread Dr James Smith

John,

Sometimes it's difficult to see what the error is because you can't see 
the request (it doesn't get logged).


To get round this, add:

 * a transhandler which writes a tag (e.g. ST), the request and the PID
   to the error log;
 * a cleanuphandler which does the same, with a different tag (e.g. FI).

You can then get a better idea of what is causing the error: the 
request that causes the seg fault will have an ST just before the seg 
fault but no FI. You will also have a history of all the requests 
handled by that PID (in case the problem is cumulative).
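
A minimal sketch of such a pair of handlers (the package name is hypothetical; unparsed_uri and the Perl*Handler directives are standard mod_perl 2 API, and warn writes to the Apache error log):

```perl
package My::SegvTrace;

use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Const -compile => qw(DECLINED OK);

sub trans_handler {
    my $r = shift;
    # "ST" marks the start of a request; $$ ties it to the child PID
    warn sprintf "ST %d %s\n", $$, $r->unparsed_uri;
    return Apache2::Const::DECLINED;   # let normal URI translation continue
}

sub cleanup_handler {
    my $r = shift;
    # "FI" marks a request that finished without killing the child
    warn sprintf "FI %d %s\n", $$, $r->unparsed_uri;
    return Apache2::Const::OK;
}

1;
```

configured with:

```apache
PerlTransHandler   My::SegvTrace::trans_handler
PerlCleanupHandler My::SegvTrace::cleanup_handler
```

An ST with no matching FI then points at the request that took the child down.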


Some time ago (about 12 years) we were having errors with apparently 
random requests (including static images). Doing this, we discovered 
that the request which died was the request after a request which talked 
to a particular Oracle database.


On the live site we just killed the child at the end of those 
requests... and then went back to diagnose the error...


James

On 03/09/2015 22:21, John Dunlap wrote:
Ever since upgrading from Debian 7 - which shipped with Apache 2.2 - 
to Debian 8 - which shipped with Apache 2.4 - my user base has been 
reporting that their browsers randomly tell them "No data received". 
To date, they have not been able to identify any kind of pattern which 
triggers it. I've been sifting through the server logs looking for 
problems and I'm seeing a lot of errors similar to the following:
[Thu Sep 03 21:12:52.382357 2015] [core:notice] [pid 13199:tid 
140364918835072] AH00052: child pid 2088 exit signal Segmentation 
fault (11)
[Thu Sep 03 21:13:03.406215 2015] [core:notice] [pid 13199:tid 
140364918835072] AH00052: child pid 2121 exit signal Segmentation 
fault (11)
[Thu Sep 03 21:13:05.417909 2015] [core:notice] [pid 13199:tid 
140364918835072] AH00052: child pid 2165 exit signal Segmentation 
fault (11)
[Thu Sep 03 21:13:08.433829 2015] [core:notice] [pid 13199:tid 
140364918835072] AH00052: child pid 2232 exit signal Segmentation 
fault (11)
[Thu Sep 03 21:15:53.614351 2015] [core:notice] [pid 13199:tid 
140364918835072] AH00052: child pid 2264 exit signal Segmentation 
fault (11)
[Thu Sep 03 21:16:03.637236 2015] [core:notice] [pid 13199:tid 
140364918835072] AH00052: child pid 2539 exit signal Segmentation 
fault (11)



Can someone give me some tips on how to proceed with troubleshooting 
this and, possibly, fixing it?


--
John Dunlap
CTO | Lariat
Direct: j...@lariat.co
Customer Service: 877.268.6667 | supp...@lariat.co






Re: Enquiry about mod_perl project state

2015-08-15 Thread Dr James Smith

I agree with Randolf,

I have watched a number of projects move away from mod_perl, often to 
Dancer, Catalyst etc., and then they ask "can I do X, Y or Z?"...

I say "if you were using mod_perl you could do that easily", but 
usually find the tool chain for doing something similar in Dancer or 
Catalyst is far more complex, because you can't simply add a handler 
here or an output filter there. You can do this with chained middleware 
in Dancer, but it is not that easy to implement...

The main thing with mod_perl is that, as long as Apache doesn't mess 
around with the core HTTPD interface (like they did with 2.4), there 
isn't much that needs changing, because it is good, solid and just 
works... If it ain't broke then it doesn't need changing.


I probably use more "pure" mod_perl - no registry scripts here, thank 
you! - and hook into the Apache process at about 7-8 places to handle 
users, templating, diagnostics, optimisation etc.


On 14/08/2015 18:51, Randolf Richardson wrote:

Hello,

I wanted to enquire about the status of mod_perl, since there is largely an
impression it is end of life. The project site also does not say much. I
see many of the mod_perl shops now moving to perl Dancer/Mojolicious etc.
or going the Java way.

I'm still using mod_perl to develop new web sites.  The most recent
one I've published is called the "atheist Blog Roll" and it uses a
PostgreSQL database in the back-end:

http://www.atheistblogroll.com/

There are other projects on my horizon that continue to be developed
in mod_perl, and they range from simple web sites to fully
interactive projects.

When there is a need for a client-side application, I use Java
because I only have to write the code once to gain support on
multiple Operating Systems (e.g., Unix/Linux, Windows, MacOS), and if
it needs to interact with a web site, then I typically use mod_perl
for that end of things.

For one of the projects I'm working on (which is not ready for
public consumption quite yet), I've also written a WHOIS server using
mod_perl (which listens on TCP port 43, and responds to queries based
on its findings in PostgreSQL) to facilitate public membership record
lookups (only for the portions that will be publicly accessible).


What is the future of mod_perl beyond mod_perl 2.0? What is the upgrade
path recommended by the mod_perl veterans?

When I upgrade, I'm normally installing new server hardware and so I
migrate sites over one at a time, and resolve any API change
requirements before promoting the new server to production (followed
by log file merges after switching servers and traffic to the old
servers cease).


Regards,
Ashish

Randolf Richardson - rand...@inter-corporate.com
Inter-Corporate Computer & Network Services, Inc.
Beautiful British Columbia, Canada
http://www.inter-corporate.com/








Re: Large File Download

2015-03-31 Thread Dr James Smith

On 28/03/2015 19:54, Issac Goldstand wrote:

sendfile is much more efficient than that.  At the most basic level,
sendfile allows a file to be streamed directly from the block device (or
OS cache) to the network, all in kernel-space (see sendfile(2)).

What you describe below is less effective, since you need to ask the
kernel to read the data, chunk-by-chunk, send it to userspace, and then
from userspace back to kernel space to be sent to the net.

Beyond that, the Apache output filter stack is also spending time
examining your data, possibly buffering it differently than you are (for
example to make HTTP chunked-encoding) - by using sendfile, you'll be
bypassing the output filter chain (for the request, at least;
connection/protocol filters, such as HTTPS encryption will still get in
the way, but you probably want that to happen :)) further optimizing the
output.

If you're manipulating data, you need to stream yourself, but if you
have data on the disk and can serve it as-is, sendfile will almost
always perform much, much, much better.

In the cases I was pointing out (in line with the original request), 
streaming the data is more efficient than writing it to disk and then 
using sendfile (reading is more efficient than writing, from experience).

It depends on the cost of producing the file: the end response time for 
the user may well be less by streaming it out than by writing it to 
disk and then sending it to the user with sendfile; they will already 
have most of the file (network permitting) before the last chunk of 
content is produced.

I won't re-iterate the rest of the issues with memory management, but I 
will also point out that if you ever write a file to disk you are 
putting your server at risk, either from a security or a DoS point of 
view, so if you can avoid it, do so!


---
This email has been checked for viruses by Avast antivirus software.
http://www.avast.com





Re: Large File Download

2015-03-28 Thread Dr James Smith
You can effectively stream a file byte by byte: you just need to print 
a chunk at a time and mod_perl and Apache will handle it 
appropriately... I do this all the time to handle large data downloads 
(the systems I manage are backed by petabytes of data)...


The art is often not in the output but in the way you get and process 
data before sending it. I have code that will upload/download 
arbitrarily large files (using HTML5's File objects) without using 
excessive amounts of memory... (all data is stored in chunks in a MySQL 
database)


Streaming has other advantages with large data: if you wait until you 
have generated all the data you will often get a timeout. I have a 
script which can take up to 2 hours to generate all its output, but it 
never times out, as it sends a line of data at a time, so data is sent 
every 5-10 seconds... and the memory footprint is trivial, as only the 
data for one line of output is in memory at a time.
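
A sketch of that chunk-at-a-time pattern (file path and chunk size are illustrative; $r->print comes from Apache2::RequestIO):

```perl
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO ();
use Apache2::Const -compile => qw(OK);

use constant CHUNK => 64 * 1024;   # 64 KB at a time

sub handler {
    my $r = shift;
    $r->content_type('application/octet-stream');

    open my $fh, '<:raw', '/path/to/large/file' or die "open: $!";
    while (read $fh, my $buffer, CHUNK) {
        $r->print($buffer);   # Apache buffers and sends as the loop runs
    }
    close $fh;

    return Apache2::Const::OK;
}
```

Only one chunk is ever in memory, so the footprint stays constant no matter how large the file is.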



On 28/03/2015 16:25, John Dunlap wrote:
sendfile sounds like its exactly what I'm looking for. I see it in the 
API documentation for Apache2::RequestIO but how do I get a reference 
to it from the reference to Apache2::RequestRec which is passed to my 
handler?


On Sat, Mar 28, 2015 at 9:54 AM, Perrin Harkins wrote:


Yeah, sendfile() is how I've done this in the past, although I was
using mod_perl 1.x for it.

On Sat, Mar 28, 2015 at 5:55 AM, André Warnier wrote:

Randolf Richardson wrote:

I know that it's possible (and arguably best practice) to use Apache to 
download large files efficiently and quickly, without passing them 
through mod_perl. However, the data I need to download from my 
application is both dynamically generated and sensitive so I cannot 
expose it to the internet for anonymous download via Apache. So, I'm 
wondering if mod_perl has a capability similar to the output stream of 
a java servlet. Specifically, I want to return bits and pieces of the 
file at a time over the wire so that I can avoid loading the entire 
file into memory prior to sending it to the browser. Currently, I'm 
loading the entire file into memory before sending it and

Is this possible with mod_perl and, if so, how should I go about 
implementing it?


Yes, it is possible -- instead of loading the entire contents of a file 
into RAM, just read blocks in a loop and keep sending them until you 
reach EoF (End of File).

You can also use $r->flush along the way if you like, but as I 
understand it this isn't necessary because Apache HTTPd will send the 
data as soon as its internal buffers contain enough data. Of course, if 
you can tune your block size in your loop to match Apache's output 
buffer size, then that will probably help. (I don't know much about the 
details of Apache's output buffers because I've not read up too much on 
them, so I hope my assumptions about this are correct.)

One of the added benefits you get from using a loop is that you can 
also implement rate limiting if that becomes useful. You can certainly 
also implement access controls as well by cross-checking the file being 
sent with whatever internal database queries you'd normally use to 
ensure it's okay to send the file first.


You can also:
1) write the data to a file
2) $r->sendfile(...);
3) add a cleanup handler, to delete the file when the request has been 
served.
See here for details:
http://perl.apache.org/docs/2.0/api/Apache2/RequestIO.html#C_sendfile_

For this to work, there is an Apache configuration directive which must 
be set to "on"; I believe it is called "EnableSendfile". Essentially 
what sendfile() does is delegate the actual reading and sending of the 
file to Apache httpd and the underlying OS, using code which is 
specifically optimised for this purpose. It is much more efficient than 
doing this in a read/write loop by yourself, at the cost of having less 
fine control over the operation.
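
That three-step recipe might be sketched like this (the report generator and content type are hypothetical; sendfile and push_handlers are documented mod_perl 2 API):

```perl
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::RequestIO ();     # sendfile
use Apache2::RequestUtil ();   # push_handlers
use Apache2::Const -compile => qw(OK);
use File::Temp ();

sub handler {
    my $r = shift;

    # 1) write the dynamically generated data to a temporary file
    my $tmp = File::Temp->new(UNLINK => 0);
    print {$tmp} generate_report($r);   # hypothetical generator
    close $tmp;
    my $path = $tmp->filename;

    # 2) let Apache and the OS stream it out efficiently
    $r->content_type('application/octet-stream');
    $r->sendfile($path);

    # 3) delete the file once the request has been served
    $r->push_handlers(
        PerlCleanupHandler => sub { unlink $path; Apache2::Const::OK });

    return Apache2::Const::OK;
}
```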





--
John Dunlap
CTO | Lariat
Direct: j...@lariat.co
Customer Service: 877.268.6667 | supp...@lariat.co








Re: mod_perl for multi-process file processing?

2015-02-02 Thread Dr James Smith

Alan/Alexandr,

There will always be an overhead in using a webserver to do this, even 
using mod_perl.


Assumptions:

 * from what you are saying there is no actual website involved, but
   you want to use mod_perl to cache data for an offline process;
 * one set of data is used once and once only per run.

Pros:

 * make sure you load your module at server startup so that each child
   shares the same memory rather than generating its own copy of the
   data;
 * if you use something like curl multi as the fetcher you can write a
   simple parallel fetching queue to get the data, which is great if
   you have a multi-core box.

Cons:

 * there is an overhead in using the HTTP protocol and a webserver; if
   you aren't going to gain much from the parallelization above, you
   may find that a simple script which loops over all the data is more
   efficient;
 * in your case the processing is probably about 10ms (or less), so the
   apache/HTTP round-tripping will probably take much more time than
   the actual processing...


On 03/02/2015 05:02, Alexandr Evstigneev wrote:
Pre-loading is good, but what you need, I believe, is the Storable 
module. If your files contain parsed data (hashes), just store them 
serialized. If they contain raw data that needs to be parsed, you can 
pre-parse it, serialize it and store it as binary files.

Storable is written in C and works very fast.
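
For instance (file names and the parser are illustrative; store and retrieve are Storable's documented interface):

```perl
use strict;
use warnings;
use Storable qw(store retrieve);

# one-off pre-parse step: parse the text once, serialize the hash to disk
my %big_hash = parse_text_file('file.txt');   # hypothetical parser
store \%big_hash, 'big_hash.stor';

# later, e.g. in startup.pl: load it back with one fast binary read
my $href = retrieve('big_hash.stor');
```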


2015-02-03 7:11 GMT+03:00 Alan Raetz:


So I have a perl application that upon startup loads about ten
perl hashes (some of them complex) from files. This takes up a few
GB of memory and about 5 minutes. It then iterates through some
cases and reads from (never writes) these perl hashes. To process
all our cases, it takes about 3 hours (millions of cases). We
would like to speed up this process. I am thinking this is an
ideal application of mod_perl because it would allow multiple
processes but share memory.

The scheme would be to load the hashes on apache startup and have
a master program send requests with each case and apache children
will use the shared hashes.

I just want to verify some of the details about variable sharing. 
Would the following setup work (oversimplified, but you get the

idea…):

In a file Data.pm, which I would use() in my Apache startup.pl
, I would load the perl hashes and have hash
references that would be retrieved with class methods:

package Data;

my %big_hash;

open(FILE, '<', 'file.txt') or die "file.txt: $!";

while (my $line = <FILE>) {

  … code ….

  $big_hash{ $key } = $value;
}

sub get_big_hashref { return \%big_hash; }



And so in the apache request handler, the code would be something
like:

use Data;

my $hashref = Data::get_big_hashref();

…. code to access $hashref data with request parameters…..



The idea is the HTTP request/response will contain the relevant
input/output for each case… and the master client program will
collect these and concatentate the final output from all the requests.

So any issues/suggestions with this approach? I am facing a
non-trivial task of refactoring the existing code to work in this
framework, so just wanted to get some feedback before I invest
more time into this...

I am planning on using mod_perl 2.07 on a linux machine.

Thanks in advance, Alan










Re: Perl + DBD-Oracle, problems with encoding when "PerlHandler Apache::Registry" is in use

2014-11-23 Thread Dr James Smith

On 23/11/2014 05:42, Ruben Safir wrote:

did you ever get this worked out.

I'm looking to use perl with my oracle set up.  Any tips would be
appreciated.
I had similar problems on Apache 2.2 with DBD::Oracle; after serious 
debugging it turned out to be a nasty environment variable issue. 
DBD::Oracle, when first use'd, was not finding the appropriate Oracle 
environment variables, and consequently wasn't pulling in the Oracle 
defaults (or reading things like the tnsnames file), and then cached 
the broken values; subsequent use's failed because it wasn't reading 
the Oracle config from disk but was using its own cached values from 
the first "use"...


James

Ruben


On Fri, Feb 19, 2010 at 05:47:51PM +0700, michael kapelko wrote:

Hello.
Here's a short script I used to find out the problem with the Apache::Registry:

#!/usr/bin/perl -wT
use strict;
use warnings;
use CGI;
use DBI;
use DBI qw(:sql_types);
use encoding 'utf-8';

my $cgi = new CGI;
print $cgi->header(-type=> "text/html",
   -charset => "utf-8");
print $cgi->start_html(-lang => "ru-RU",
   -title => "Title");
print $cgi->h1("Title");
my $db = DBI->connect("DBI:Oracle:SID=ELTC;HOST=10.102.101.4",
  , , {RaiseError => 1,
 AutoCommit => 0,
 ora_charset => "UTF8"});
my $query = "select name from swmgr2.vw_switches where sw_id_ip = 231504";
# Selects the Russian "name" from the DB in UTF-8, because on the
# previous line we asked Oracle to return data to us in UTF-8.
my $stmt = $db->prepare($query);
$stmt->execute();
my $name;
$stmt->bind_columns(undef, \$name);
$stmt->fetch();
$stmt->finish();
$db->disconnect();
print $cgi->p($name);
print $cgi->end_html();

When invoked directly by the shell or in web page WITHOUT "PerlHandler
Apache::Registry", the UTF-8 encoded string in Russian is printed just
fine. But when "PerlHandler Apache::Registry" is used, only  are
printed in web page.
Thanks.

.








Re: Apache2::Connection::remote_ip

2014-11-20 Thread Dr James Smith

On 20/11/2014 22:39, John Dunlap wrote:

Could you give us a link to the documentation you are using?

On Thu, Nov 20, 2014 at 5:38 PM, worik wrote:


Can't locate object method "remote_ip" via package
"Apache2::Connection


You are probably looking at the 2.2 documentation for a 2.4 Apache...

I tend to use:

($r->connection->can('remote_ip') ? $r->connection->remote_ip : 
$r->connection->client_ip);


(in fact I have a wrapper function to do this)

so the code base will (does) work on both 2.2 and 2.4...

james
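
Such a wrapper might look like this (the function name is made up; remote_ip is the 2.2 accessor, client_ip its 2.4 replacement):

```perl
use Apache2::Connection ();

# Return the connection's peer address on both Apache 2.2 and 2.4
sub connection_ip {
    my $r = shift;
    my $c = $r->connection;
    return $c->can('remote_ip') ? $c->remote_ip : $c->client_ip;
}
```

Note that on 2.4, as Worik found, $r->useragent_ip may be what you actually want when something in front of Apache sets the client address.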


Turns out I should have used  $r->useragent_ip

What is going on?  Why is the documentation at odds with the code?

W
--
Why is the legal status of chardonnay different to that of cannabis?
worik.stan...@gmail.com 
021-1680650, (03) 4821804
  Aotearoa (New Zealand)
 I voted for love




--
John Dunlap
CTO | Lariat
Direct: j...@lariat.co
Customer Service: 877.268.6667 | supp...@lariat.co








Re: Disconnect database connection after idle timeout

2014-11-13 Thread Dr James Smith

On 13/11/2014 15:43, Perrin Harkins wrote:
On Thu, Nov 13, 2014 at 10:29 AM, Xinhuan Zheng wrote:


We don’t have any front end proxy.


I think I see the problem... ;)

If you use a front-end proxy so that your mod_perl servers are only 
handling mod_perl requests, and tune your configuration so that idle 
mod_perl servers don't sit around for long, that should avoid 
timeouts. Apache::DBI should also re-connect with no problems if a 
request comes in after a connection has timed out. If that isn't 
happening, make sure you are using Apache::DBI properly.


Seriously, using a front-end proxy usually reduces the number of 
databases connections about 10 times. It's the easiest fix here by far.


- Perrin
From experience, and having chatted with our DBAs at work: with modern 
Oracle and with MySQL, keeping persistent connections around is no real 
gain and carries lots of risks... We have a number of issues with 
inter-machine connections, as we have a number of firewalls in place...


If the connection failure is due not to either end shutting down but to 
the network failing in between, Apache::DBI will not (cannot) do the 
right thing: the ping check just hangs as it tries the broken 
connection (and as it isn't connected it waits forever for a response).


You are far better off using non-persistent connections. Where we do 
use persistent connections (cache/session database) we set a timeout at 
the adaptor level which won't use a connection if it has been idle for 
more than a given period of time, but instead drops the connection and 
re-builds it.


Most DB servers now cache queries etc. on the server side, so 
persistent connections are not required to take advantage of these.


Additionally, if you have a load-balanced situation with a number of 
backends, each talking back to a large number of databases, it is very 
easy (at least with MySQL) to reach DB connection limits:


   7 backends x 50 processes x 300 databases ~ 100,000 connections if 
each child process holds a database connection to each database...


Perhaps this is a bit extreme (the project involved is a very complex 
genomic interface), but persistent connections used to be its Achilles 
heel... where we worry about database connection issues we often use 
DBIx::Connector as a better way to handle disconnects.
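
A minimal DBIx::Connector sketch (DSN, credentials and query are placeholders; new and run with the fixup mode are the module's documented interface):

```perl
use strict;
use warnings;
use DBIx::Connector;

my ($user, $pass) = ('app', 'secret');   # placeholder credentials
my $conn = DBIx::Connector->new(
    'dbi:mysql:database=sessions;host=dbhost', $user, $pass,
    { RaiseError => 1, AutoCommit => 1 },
);

# "fixup" only pings/reconnects after a query fails, so the common
# case pays no per-request ping cost (unlike Apache::DBI's ping)
my $rows = $conn->run(fixup => sub {
    $_->selectall_arrayref(
        'SELECT id, data FROM session WHERE id = ?', undef, 42);
});
```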








Re: Custom response problem

2014-03-18 Thread James Smith


Try:

use Apache2::Response ();

This should add the method to $r (a lot of the Apache2:: modules do this 
- Apache2::RequestUtil, Apache2::Upload etc)
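
With that in place, the handler from the original post would look something like this (same status and body as John's example; custom_response is documented in Apache2::Response):

```perl
use strict;
use warnings;
use Apache2::RequestRec ();
use Apache2::Response ();   # provides $r->custom_response
use Apache2::Const -compile => qw(HTTP_INTERNAL_SERVER_ERROR);

sub handler {
    my $r = shift;
    # register the body to serve instead of the stock 500 page
    $r->custom_response(Apache2::Const::HTTP_INTERNAL_SERVER_ERROR, 'hi mom');
    return Apache2::Const::HTTP_INTERNAL_SERVER_ERROR;
}
```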

On 18/03/2014 16:16, John Dunlap wrote:
I've tried it with "use Apache2::RequestRec;" at the top of my handler 
and without it. The outcome is the same in both cases. If I attempt to 
install it from CPAN, it says that it is already installed.



On Tue, Mar 18, 2014 at 12:13 PM, Andreas Mock > wrote:


Have you loaded Apache2::RequestRec?

Best regards

Andreas Mock

From: John Dunlap [mailto:j...@lariat.co]
Sent: Tuesday, 18 March 2014 16:59
To: mod_perl list
Subject: Custom response problem

I recently upgraded my workstation from Debian 6 to Debian 7 and
I'm now encountering a problem that I haven't seen before. My
apache version is 2.2.22-13+deb7u1. My mod_perl version is
2.0.7-3. I'm guessing that I have an installation problem of some
kind but I'm not sure where to look for problems. My application
works correctly until I attempt to define a custom response, as
follows,

sub handler {
    my $apache = shift;

    $apache->custom_response(Apache2::Const::HTTP_INTERNAL_SERVER_ERROR,
        'hi mom');

    return Apache2::Const::HTTP_INTERNAL_SERVER_ERROR;
}

I see a 500 error, which is what I want, when I access the page.
However the error page is the default apache 500 error response
page and I want to override it. When I look in the logs, I see this:

[Tue Mar 18 15:41:32 2014] [error] [client 127.0.0.1] Can't locate
object method "custom_response" via package "Apache2::RequestRec"
at
/usr/local/lariat-trunk/qa-trunk/lib/Lariat/V4/WS/RS/BootstrapHandler.pm
line 41.

This would imply, at least to me, that this method is not compiled
into mod_perl or perhaps into apache itself but I cannot be sure.
Any suggestions?

Cheers!

John








Re: Tool to create multiple requests

2012-02-07 Thread James Smith

On 07/02/2012 08:58, André Warnier wrote:

Tobias Wagener wrote:

Hello,

I'm currently developing a huge application with mod_perl, unixODBC 
and MaxDB/SAPDB.

On my developing system everything is fine. But on the productive system
with > 50 users, I have database connection errors and request aborts 
and

so on.

Now I want to ask if someone knows a tool or perl modules, where I 
can simulate
50 users. I have a list with some common request including the query 
parameter in order of appearence. But I don't know, how to send them 
to my developing system to create the same load as it will be on the 
productive system.


Can someone help me with this issue?


As a simple tool, have a look at the "ab" program that comes with Apache.

ab isn't usually that much use as it does strain the databases in the 
same way - especially as it only takes 1 URL - any "production" scaling 
e.g. caching etc will make it even worse. I have a simple one I wrote 
based on curl which can take a list of URLs which is generally a lot 
better.
Without being careful it is still difficult to cope with large sites 
acting like users (you will need to extend the code to do cookie jars!)...


Things we have come across in large production systems are.

 * Apache::DBI sometimes causes issues with too many database
   connections - we tend to turn it off and use DBIx::Connector
   (plus carefully selected caching) to handle connection persistence;
 * You may be serving too many requests to users per page view
   (beyond the main page request) - look at
   caching/minimising/merging page design elements - background images,
   javascript, CSS;
 * OR  - serve them from a second apache (use it either as a proxy or
   have a proxy in front of the two apaches)
 * Don't over AJAX pages - lots of "parallel" AJAX requests can
   effectively DOS your service;
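The "serve design elements separately" points above might look like this in Apache configuration (paths, the backend address, and the use of mod_proxy and mod_expires are illustrative assumptions):

```apache
# Serve static design elements directly from this Apache,
# proxy everything else to the application server
ProxyPass        /static !
Alias            /static /var/www/static
<Directory /var/www/static>
    ExpiresActive On
    ExpiresDefault "access plus 1 week"
</Directory>
ProxyPass        /  http://backend.example.org:8080/
ProxyPassReverse /  http://backend.example.org:8080/
```

The `ProxyPass /static !` exclusion must come before the catch-all proxy rule so that static requests never reach the backend.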

Another trick is to add extra diagnostics to your apache logs with a 
couple of mod_perl handlers:



   *Apache configuration file:*

PerlLoadModule              Pagesmith::Apache::Timer
PerlChildInitHandler        Pagesmith::Apache::Timer::child_init_handler
PerlChildExitHandler        Pagesmith::Apache::Timer::child_exit_handler
PerlPostReadRequestHandler  Pagesmith::Apache::Timer::post_read_request_handler
PerlLogHandler              Pagesmith::Apache::Timer::log_handler

LogFormat "%V [%P/%{CHILD_COUNT}e %{SCRIPT_TIME}e %{outstream}n/%{instream}n=%{ratio}n] %h/%{X-Forwarded-For}i %l/%{SESSION_ID}e %u/%{user_name}e %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" \"%{X-Requested-With}i\" %{SCRIPT_START}e/%{SCRIPT_END}e" diagnostic


The following module sets up the four environment variables 
*CHILD_COUNT, SCRIPT_START, SCRIPT_END and SCRIPT_TIME*, so you can see 
which requests are slow, and whether you have any "clustering" of 
requests... Setting "Readonly my $LEVEL" to either 'normal' or 'noisy' 
will give you more diagnostics in the error log as well.



   *Module:*

package Pagesmith::Apache::Timer;

## Component
## Author : js5
## Maintainer : js5
## Created: 2009-08-12
## Last commit by : $Author: js5 $
## Last modified  : $Date: 2011-10-26 12:44:20 +0100 (Wed, 26 Oct 2011) $
## Revision   : $Revision: 1489 $
## Repository URL : $HeadURL: svn+ssh://web-svn.internal.sanger.ac.uk/repos/svn/shared-content/trunk/lib/Pagesmith/Apache/Timer.pm $


use strict;
use warnings;
use utf8;

use version qw(qv); our $VERSION = qv('0.1.0');

use Readonly qw(Readonly);
Readonly my $VERY_LARGE_TIME => 1_000_000;
Readonly my $CENT            => 100;
Readonly my $LEVEL           => 'normal'; # (quiet,normal,noisy)

use Apache2::Const qw(OK DECLINED);
use English qw(-no_match_vars $PID);
use Time::HiRes qw(time);

my $child_started;
my $request_started;
my $requests;
my $total_time;
my $min_time;
my $max_time;
my $total_time_squared;

sub post_config_handler {
  return DECLINED if $LEVEL eq 'quiet';

  printf {*STDERR} "TI:   Start apache  %9d\n", $PID;
  return DECLINED;
}

sub child_init_handler {
  return DECLINED if $LEVEL eq 'quiet';

  $child_started      = time;
  $requests           = 0;
  $total_time         = 0;
  $min_time           = $VERY_LARGE_TIME;
  $max_time           = 0;
  $total_time_squared = 0;

  printf {*STDERR} "TI:   Start child   %9d\n", $PID;
  return DECLINED;
}

sub post_read_request_handler {
  my $r = shift;

  return DECLINED if $LEVEL eq 'quiet';

  $request_started = time;
  $requests++;

  return DECLINED unless $LEVEL eq 'noisy';

  printf {*STDERR} "TI:   Start request %9d - %4d  %s\n",
$PID,
$requests,
$r->uri;
  return DECLINED;
}

sub log_handler {
  my $r = shift;

  return DECLINED if $LEVEL eq 'quiet';

  my $request_ended = time;
  my $t = $request_ended - $request_started;

  $total_time += $t;
  $min_time = $t if $t < $min_time;
  $max_time = $t if $t > $max_time;
  $total_time_squared += $t * $t;
  $r->subprocess_env->{'CH

Re: mod_perl output filter and mod_proxy, mod_cache

2011-07-14 Thread James Smith

On 14/07/2011 11:39, Tim Watts wrote:

On 14/07/11 11:16, André Warnier wrote:

Hi Andre,

Thanks for the quick reply :)


(That would probably be difficult, inefficient or both)

Assuming that what you say about Tomcat is true (I don't know, and it
may be worth asking this on the Tomcat list), I can think of another way
to achieve what you seem to want :
if you can distinguish, from the request URL (or any other request
property), the requests that are for invariant things, then you could
arrange to /not/ proxy these requests to Tomcat, and serve them directly
from Apache httpd.


Indeed that is a good idea. We are doing that for new projects for css 
and js files (apache does not proxy certain paths and picks these up 
from the local filesystem).


We can't do that for the 100-odd legacy servers as no-one has time to 
delve into the Java/JSP code. I need to do something "outside" of 
Tomcat where possible. Just to explain, each web server is a paid-for 
project - and when it's done, it sits there for 5+ years.


Only I have the time/inclination to fix this as it's killing my VMWare 
infrastructure. Because the sites are all fronted by apache in a 
similar way, one solution is likely to apply to most of the sites.


I would also add that most of the sites are "dynamically" driven 
pages, even involving MySQL querying, but once launched, the data 
remains fairly static - e.g. GET X will always resolve to response Y.


I'm planning a small seminar on the value of Cache-Control for my dev 
colleagues so they can stop making this mistake ;-> But that still 
leaves a lot of "done" projects to fix.



Which proxying method exactly are you using between Apache and Tomcat ?
(if you are using mod_proxy, then you are either using mod_proxy_http or
mod_proxy_ajp; you could also consider using mod_jk).


mod_proxy_http specifically.

mod_jk looks interesting for new projects (we have local tomcats for 
those now) - I think it may be a non-starter for old stuff as trying 
to retro fit it may not be so simple (our older tomcat servers are in 
a remote farm on their own machines hence the use of mod_proxy_http).


Shouldn't be an issue - you can point mod_jk at a remote machine; I 
do it a lot so that we can push the Tomcat application out through our 
templating output filter... Tomcat produces a plain HTML page with 
none of the styling, and this is wrapped using our custom output filter; 
I'm guessing at this stage you can do what you want with the script...
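A minimal mod_jk hookup to a remote Tomcat might look like the following sketch (worker name, host, port and mount path are all hypothetical):

```properties
# conf/workers.properties - host and port are hypothetical
worker.list=remote_tomcat
worker.remote_tomcat.type=ajp13
worker.remote_tomcat.host=tomcat.example.org
worker.remote_tomcat.port=8009
```

and in the Apache configuration:

```apache
JkWorkersFile conf/workers.properties
JkMount /app/* remote_tomcat
```

The AJP connector just needs to be enabled on port 8009 on the remote Tomcat; nothing else in the webapp has to change.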


James


Also, what are the versions of Apache and Tomcat that you are using ?



Apache 2.2 (various sub versions) and both tomcat 5.5 and tomcat 6 
(but all on remote machines listening on TCP sockets).


I think for this problem, I have to treat tomcat as a little, rather 
inefficient, black box and try to fixup on the apache front ends, 
hence the direction of my original idea...


Cheers,

Tim







Re: Ways to scale a mod_perl site

2009-09-17 Thread James Smith

Igor Chudov wrote:

Guys, I completely love this discussion about cookies. You have really
enlightened me.

I think that letting users store cookie info in a manner that is secure
(involves both encryption and some form of authentication), instead of
storing them in a table, could possibly result in a very substantial
reduction of database use.

  
Alternatively store the information in a two-level cache
(memcached/database) with write-through - then most of the time you get
the data from memcached - you can do the same with the images...

write entry: -> write data to memcached; write data to SQL cache

read entry:  -> read data from memcached and return OR
                read data from SQL cache, write to memcached and return

Should avoid most database reads! Works well for the images you create
to minimize database accesses
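The write/read entries above could be sketched in Perl roughly as follows - here $memd stands for a Cache::Memcached-style client and $dbh for a DBI handle; the session_cache table and the MySQL-style REPLACE are illustrative assumptions:

```perl
# Write-through two-level cache sketch (Cache::Memcached + DBI assumed).
sub cache_write {
  my ( $memd, $dbh, $key, $value ) = @_;
  $memd->set( $key, $value );                       # fast tier
  $dbh->do(                                         # durable tier
    'REPLACE INTO session_cache (k, v) VALUES (?, ?)',
    undef, $key, $value,
  );
}

sub cache_read {
  my ( $memd, $dbh, $key ) = @_;
  my $value = $memd->get($key);                     # try memcached first
  return $value if defined $value;
  ($value) = $dbh->selectrow_array(
    'SELECT v FROM session_cache WHERE k = ?', undef, $key );
  $memd->set( $key, $value ) if defined $value;     # repopulate fast tier
  return $value;
}
```

Because every write goes to both tiers, a memcached restart only costs you read performance until the hot keys are repopulated.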

The cookie is

1) Encrypted string that I want and
2) MD5 of that string with a secret code appended that the users do not
know, which serves as a form of signing
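That sign-and-verify scheme might be sketched like this (the secret and the cookie layout are placeholders, and for new code an HMAC such as Digest::SHA's hmac_sha256_hex would be preferable to bare MD5):

```perl
# Sketch of the scheme above: the cookie carries the (already encrypted)
# payload plus an MD5 of payload + server-side secret.
use Digest::MD5 qw(md5_hex);

my $SECRET = 'server-side-secret';   # illustrative only

sub make_cookie {
  my ($payload) = @_;                # assumed already encrypted
  return $payload . ':' . md5_hex( $payload . $SECRET );
}

sub verify_cookie {
  my ($cookie) = @_;
  my ( $payload, $sig ) = $cookie =~ /^(.*):([0-9a-f]{32})$/s or return;
  return $sig eq md5_hex( $payload . $SECRET ) ? $payload : undef;
}
```

verify_cookie returns the payload only when the signature checks out, so a tampered cookie simply falls back to "not logged in" with no database hit.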

That should work. I will not change it now, but will do if I get 2x more
traffic.

That way I would need zero hits to the database to handle my users sessions.


(I only retrieve account information when necessary)

As far as I remember now, I do not store much more information in a session
beyond username. (I hope that I am not wrong). So it should be easy.

Even now, I make sure that I reset the cookie table only every several
months. This way I would let users stay logged on forever.

Thanks a lot.

Igor

  







Re: Debugging seg faults in Apache mod_perl

2009-08-17 Thread James Smith


Phillipe,

Thanks, that seems to have solved the problem - I'd been slurping in the 
page template file by undefining $/... It also explains why switching 
the handler type from perl-script to modperl also resolved the 
issue...


James

On Tue, 18 Aug 2009, Philippe M. Chiasson wrote:


On 17/08/09 21:48 , Philippe M. Chiasson wrote:

On 17/08/09 19:54 , James Smith wrote:


Further to this, I now have a stack trace, if that helps anyone point me in
the right direction:

#0  0x7f17ee936fb1 in strncpy () from /lib/libc.so.6
#1  0x7f17e808ecfc in modperl_perl_global_request_save () from
  /usr/lib/apache2/modules/mod_perl.so


That's happening in a fairly simple piece of code:

modperl_perl_global_svpv_save(pTHX_ modperl_perl_global_svpv_t *svpv)
{
svpv->cur = SvCUR(*svpv->sv);
strncpy(svpv->pv, SvPVX(*svpv->sv), sizeof(svpv->pv));
}

[...]

But looking at the guts, the only code path that will invoke this is
when saving/restoring the value of $/...  Can you check to see what part
of your code alters that variable and to what ?


Just had an idea, can't verify or test it right now, but I suspect this
might be caused by:

$/ = undef;

Could you have one of these lying around in your code somewhere ?

If so, can you try and change it to the usually recommended form:

local $/ = undef;

Hope this helps.
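As a sketch, wrapping the slurp in a sub with a localized $/ keeps the change scoped, so nothing else in the interpreter ever sees a globally undefined record separator:

```perl
# Safe slurp sketch: "local" confines the change to this sub, so
# mod_perl's per-request save/restore of $/ never sees a global edit.
use strict;
use warnings;

sub slurp_template {
  my ($path) = @_;
  open my $fh, '<', $path or die "Can't open $path: $!";
  local $/;                 # undef record separator, restored on return
  my $content = <$fh>;
  close $fh;
  return $content;
}
```

The assignment is undone automatically when the sub returns, even on an exception, which a plain `$/ = undef` followed by a manual restore does not guarantee.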

--
Philippe M. Chiasson GPG: F9BFE0C2480E7680 1AE53631CB32A107 88C3A5A5
http://gozer.ectoplasm.org/   m/gozer\@(apache|cpan|ectoplasm)\.org/







Re: Debugging seg faults in Apache mod_perl

2009-08-17 Thread James Smith


Further to this, I now have a stack trace, if that helps anyone point me in
the right direction:

#0  0x7f17ee936fb1 in strncpy () from /lib/libc.so.6
#1  0x7f17e808ecfc in modperl_perl_global_request_save () from
  /usr/lib/apache2/modules/mod_perl.so
#2  0x7f17e807dabc in modperl_response_handler_cgi () from
  /usr/lib/apache2/modules/mod_perl.so
#3  0x7f17ef7272d3 in ap_run_handler () from /usr/sbin/apache2
#4  0x7f17ef72aa6f in ap_invoke_handler () from /usr/sbin/apache2
#5  0x7f17ef7385de in ap_process_request () from /usr/sbin/apache2
#6  0x7f17ef735418 in ?? () from /usr/sbin/apache2
#7  0x7f17ef72eca3 in ap_run_process_connection () from
  /usr/sbin/apache2
#8  0x7f17ef73cf46 in ?? () from /usr/sbin/apache2
#9  0x7f17ef73d276 in ?? () from /usr/sbin/apache2
#10 0x7f17ef73ddad in ap_mpm_run () from /usr/sbin/apache2
#11 0x7f17ef71360d in main () from /usr/sbin/apache2


On Mon, 17 Aug 2009, James Smith wrote:



I have two handlers, one a response handler and a second an output filter.
If either of these handlers runs alone then it runs fine for any number of
requests; if I have both of these handlers I get an untraceable seg fault.
With both handlers, this segfault happens on the second request to that
particular child.

If the page is generated from:

* the file system,
* via mod_php or
* mod_rails

I don't have a problem.

If the page content is generated by mod_perl it works perfectly well for
the first request - but fails for subsequent requests.

[notice] child pid 14475 exit signal Segmentation fault (11)

This appears to be before the page handler executes...

If I turn off the output filter - everything is OK for all requests.

Any suggestions on how I debug this... it makes life so much easier if
I can handle all requests this way (and it seems to be the apache way)

James







Debugging seg faults in Apache mod_perl

2009-08-17 Thread James Smith


I have two handlers, one a response handler and a second an output filter.
If either of these handlers runs alone then it runs fine for any number of
requests; if I have both of these handlers I get an untraceable seg fault.
With both handlers, this segfault happens on the second request to that
particular child.

If the page is generated from:

 * the file system,
 * via mod_php or
 * mod_rails

I don't have a problem.

If the page content is generated by mod_perl it works perfectly well for
the first request - but fails for subsequent requests.

[notice] child pid 14475 exit signal Segmentation fault (11)

This appears to be before the page handler executes...

If I turn off the output filter - everything is OK for all requests.

Any suggestions on how I debug this... it makes life so much easier if
I can handle all requests this way (and it seems to be the apache way)

James




Apache output filters and mod_deflate...

2009-08-11 Thread James Smith
Has anyone had experience of Apache output filters and mod_deflate - I'm 
getting some strange behaviour if the apache output filter is generating 
multiple buckets... and am looking for someone to give me some advice - 
as using output filters and mod_deflate would be the perfect solution 
for the problem I have (see previous email)






Re: mod_perl Output Filters and mod_deflate

2009-08-06 Thread James Smith

André Warnier wrote:
.. and sorry again for sending directly to you.  I keep forgetting 
this list doesn't set this automatically.


James Smith wrote:

Has anyone had experience of using mod_perl OutputFilters with
mod_deflate? I've been banging my head against a brick wall today.
I've learnt a lot about bucket brigades - but for every two steps
forward it's one step back...

Scenario:

static page - being wrapped with an output filter - works with or
without mod_deflate

dynamic page - either Perl or PHP, without mod_deflate the output of
the page gets re-processed and the correct HTML is produced, but with
mod_deflate the content returned is the unwrapped output (i.e. the
output before the output filter has been actioned?)

any ideas or ways to resolve this...

below is my code

package Sanger::Web::DecoratePage;


..
Sorry, I did not go through your code.
I assume however that this code has some way of detecting whether it
should do something to this output or not.

If mod_deflate sees the output first however, it will compress it.
Your filter, if it comes second, may not recognise this compressed
output anymore, and thus do nothing but let it through. Same for PHP.

Alternatively, mod_deflate is being a bad boy, and disables any filters
that may come after it.  In a way this would be justified, because it
really needs to be at the end of the chain.

Unfortunately, I am rich in guesses but poor in solutions.




According to the documentation, mod_deflate should be run after 
my filter, as mod_deflate is a connection filter whereas mine is a request 
filter!


My filter is being run - and getting the right input content - observed as I 
was dumping the HTML coming in and going out!






mod_perl Output Filters and mod_deflate

2009-08-05 Thread James Smith

Has anyone had experience of using mod_perl OutputFilters with
mod_deflate? I've been banging my head against a brick wall today.
I've learnt a lot about bucket brigades - but for every two steps
forward it's one step back...

Scenario:

static page - being wrapped with an output filter - works with or
without mod_deflate

dynamic page - either Perl or PHP, without mod_deflate the output of
the page gets re-processed and the correct HTML is produced, but with
mod_deflate the content returned is the unwrapped output (i.e. the
output before the output filter has been actioned?)

any ideas or ways to resolve this...

below is my code

package Sanger::Web::DecoratePage;

use strict;
use warnings;
no warnings qw(uninitialized);

use base qw(Apache2::Filter);

use Apache2::RequestRec ();

use APR::Table ();
use APR::Bucket ();
use APR::Brigade ();
use Apache2::Connection ();

use Apache2::Const -compile => qw(OK DECLINED CONN_KEEPALIVE);
use APR::Const -compile => ':common';

use constant LENGTH => 2048;
sub decorate_page {
  my $content = shift;
  my $title   = $content =~ /<title>(.*)<\/title>/s ? $1 : "untitled document";
  my $body    = $content =~ /<body[^>]*>(.*)<\/body>/s ? $1 : $content;
  return qq(<html>
  <head>
    <title>Wellcome Trust Sanger Institute: $title</title>
  </head>
  <body>
    <h1>Wellcome Trust Sanger Institute:</h1>
    $body
    <div>Footer content</div>
  </body>
</html>
);
}

sub handler : FilterRequestHandler {
  my ($filter, $bb) = @_;
  my $bb_ctx = APR::Brigade->new($filter->c->pool,
$filter->c->bucket_alloc);

  my $t = $filter->r->headers_in();
  return Apache2::Const::DECLINED unless $filter->r->status == 200;
  warn "\n";
  warn "\n";
  warn ".. handler [[\n";
  foreach my $key (keys %$t) {
warn sprintf "  %40s = %s\n", $key, $t->{$key};
  }
  warn "]]\n";
  my $ctx  = context( $filter );
  # pass through unmodified
  return Apache2::Const::DECLINED if $ctx->{state};
  my $data = exists $ctx->{data} ? $ctx->{data} : '';

  $ctx->{invoked}++;
  my( $bdata, $seen_eos, $beos ) = flatten_bb($bb);

  $data .= $bdata if $bdata;
  if ($seen_eos) {
$data = decorate_page( $data );
my $len = length $data;
$filter->r->headers_out->set('Content-Length', $len);
$filter->r->headers_out->set('Filter-Actioned',
'Sanger::Web::DecoratePage' );
$filter->r->content_type( "text/html" );

if( $data ) {
  while( $data ) {
my $x = substr($data,0,LENGTH);
substr($data,0,LENGTH) = '';
warn sprintf "inserting bucket  %4d : %10s...%10s", length($x), substr( $x,0,10), substr($x,-10,10);
my $b = APR::Bucket->new( $bb->bucket_alloc, " $x" );
$bb_ctx->insert_tail($b);
  }
}
$bb_ctx->insert_tail( $beos );

warn "<<<\n$data\n>>> (length $len)\n" if 0;
$ctx->{state}++
  } else {
warn "... store ...";
# store context for all but the last invocation
$ctx->{data} = $data;
$filter->ctx($ctx);
  }
  $t = $filter->r->headers_out();
  warn "\n\n.. [[\n";
  foreach my $key (keys %$t) {
warn sprintf "  %40s = %s\n", $key, $t->{$key};
  }
  warn "]]\n";
warn "\n";
warn "end of handler\n";
warn "\n";
  my $rv = $filter->next->pass_brigade($bb_ctx);
  return $rv unless $rv == APR::Const::SUCCESS;

  return Apache2::Const::OK;
}

sub flatten_bb {
  my $bb = shift;
  my $seen_eos = 0;
  my @data;
  my $b = $bb->first;
  while( $b ) {
$b->remove;
if($b->is_eos) {
  $seen_eos++;
  last
}
$b->read(my $bdata);
push @data, $bdata;
$b = $bb->next( $b );
  }
  return (join('', @data), $seen_eos, $b );
}

sub context {
  my ($f) = shift;
  my $ctx = $f->ctx;
  warn "... $ctx ...";
  warn $f->c->keepalive,"; ",Apache2::Const::CONN_KEEPALIVE,";
",$f->c->keepalives;
  unless ($ctx) {
$ctx = {
  state   => 0,
  keepalives  => $f->c->keepalives,
};
use Data::Dumper;
warn Data::Dumper::Dumper( $ctx );
$f->ctx($ctx);
return $ctx;
  }
  my $c = $f->c;
  if(
$c->keepalive  == Apache2::Const::CONN_KEEPALIVE &&
$ctx->{state}&&
$c->keepalives >  $ctx->{keepalives}
  ) {
$ctx->{state}  = 0;
$ctx->{keepalives} = $c->keepalives;
  }
  return $ctx;
}

1;




Re: AC US 2008

2008-11-06 Thread James Smith


On Oct 22, 2008, at 3:15 PM, Philip M. Gollucci wrote:


Hi All,

wondering who is going to present at the Apache US 2008 conference  
in New Orleans.


I'll be there 11/2 - 11/9


Made it.  Forgot to respond to this thread earlier.  I'll be here  
until Saturday morning (9th).  Presenting Friday afternoon.

--
James Smith <[EMAIL PROTECTED]>
Texas A&M University, College of Liberal Arts
Digital Humanities Lead Developer
979.845.3050



Re: Any success with storing photos in a database?

2008-09-29 Thread James Smith



On Tue, 30 Sep 2008, Cosimo Streppone wrote:

On 30 September 2008 at 00:09:52, James Smith <[EMAIL PROTECTED]>
wrote:



On Mon, 29 Sep 2008, Cosimo Streppone wrote:

On 29 September 2008 at 23:45:05, James Smith 
<[EMAIL PROTECTED]> wrote:


There are good reasons to store images (especially small ones) in 
databases (and with careful management of headers in your mod_perl).



If you have "proper" metadata, you can go and delete your files.


We have "proper" meta data


Yes, sorry. I was thinking about our case.


we produce and delete anywhere up to and including 1/2 million files
per day - and the deletion is the crippling stage on a journalled file
system


I see. Again, our case is very different though.
99,99% is create/add/modify and we almost never delete.

What filesystem/os do you use?


It has to be a shared filesystem - so at the moment GPFS/Red Hat; we are 
moving over to a memcached-based system to store and serve the temporary 
images; otherwise we would be stuck with NFS or Lustre, both of which 
fail quite badly with small files... all this is backed by fibre-attached 
SAN storage

(they really are temporary and can be easily restored)

James



--
Cosimo





Re: Any success with storing photos in a database?

2008-09-29 Thread James Smith



On Mon, 29 Sep 2008, Cosimo Streppone wrote:

On 29 September 2008 at 23:45:05, James Smith 
<[EMAIL PROTECTED]> wrote:


There are good reasons to store images (especially small ones) in databases 
(and with careful management of headers in your mod_perl).


Some of you have missed inherent problems with the file systems -
even balanced hierarchical-tree ones - in a shared server environment,
which can lead to gross inefficiencies. In your cases you may not be
doing multiple deletes, but in the examples I work with it is not the
creation and storage which breaks the file system, but the requirement
to clear out old files before filling up the file system.


If you have "proper" metadata, you can go and delete your files.
In our case, we chose to hash our paths by basically user-id,
so every file owned by a user is in the same folder and
can be deleted without any problems.



We have "proper" meta data - deleting files is an expensive operation
in all file systems, and if you have a large number of files to delete
the overhead can become excessive - we produce and delete anywhere up
to and including 1/2 million files per day, and the deletion is the
crippling stage on a journalled file system


Maybe I didn't get your point.

--
Cosimo





Re: Any success with storing photos in a database?

2008-09-29 Thread James Smith



There are good reasons to store images (especially small ones) in 
databases (and with careful management of headers in your mod_perl).


Some of you have missed inherent problems with the file systems -
even balanced hierarchical-tree ones - in a shared server environment,
which can lead to gross inefficiencies. In your cases you may not be
doing multiple deletes, but in the examples I work with it is not the
creation and storage which breaks the file system, but the requirement
to clear out old files before filling up the file system.




Re: Some perl regex help

2008-06-30 Thread James Smith



On Sun, 29 Jun 2008, Alexander Burrows wrote:



Hello again all. Been a while since I've posted here but needed some help on
a regex I was trying to write.

$line =~ tr/(\(|\)|<|>)/(\(|\)|\<|\>)/g;


Simplest approach is to make a hash of the substitutions and use
an /e ("execute") substitution:

my %hash = ( '(' => '&#40;', ')' => '&#41;', '<' => '&lt;', '>' => '&gt;' );

   $line =~ s/([()<>])/$hash{$1}/eg;



This does not work at all in perl, so I replaced the tr with s; the
search part works as expected but the replace does not. I've been trying
to read around forums and regex documents for perl but they seem unorganized
and cryptic. So any help would be appreciated.

-Alexander
--
View this message in context: 
http://www.nabble.com/Some-perl-regex-help-tp18188634p18188634.html
Sent from the mod_perl - General mailing list archive at Nabble.com.







Re: nginx load balance

2008-06-30 Thread James Smith



On Sun, 29 Jun 2008, Perrin Harkins wrote:


On Sat, Jun 28, 2008 at 9:48 AM, Jeff Peng <[EMAIL PROTECTED]> wrote:

But I have a question, does nginx support for session-keeping?
A user's request, should go always to the same original backend server.
Otherwise the user's session will get lost.


I would advise you not to do this.  It's a non-scalable design.  If
you need to keep session data beyond what will fit in an encrypted
cookie, you'd be better off storing it in a shared database.  That
way, if you lose one of your web servers, the session won't get lost.


I would consider using a shared-memory solution to keep traffic away
from the database server too (consider a solution based on memcached??)...
I would be very careful about going back to a single machine for each
request, due to traffic profiles: when a user makes a request there
is a "spike" of requests from the session, and all of these then get
handled by one machine and not load balanced...






Re: Diagnosing memory usage

2008-06-15 Thread James Smith


Michael - it depends on the OS, but you could look at the Apache::SizeLimit 
code, which kills processes when per-process memory gets too large -
it works well for the system we use at work...

If you are on a unix/linux based system, "top" is your friend, as it will
indicate the memory usage per process.
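For an Apache 1.3 / mod_perl 1.x setup like this, the Apache::SizeLimit hookup might look like the following sketch (the ~64 MB cap is an arbitrary example):

```apache
# httpd.conf sketch for mod_perl 1.x
<Perl>
    use Apache::SizeLimit;
    # value is in KB - kill a child once it exceeds ~64 MB
    $Apache::SizeLimit::MAX_PROCESS_SIZE = 65536;
</Perl>
PerlFixupHandler Apache::SizeLimit
```

A child over the limit finishes serving its current request and then exits cleanly, so a slow leak is capped without a hard server reset.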


On Sun, 15 Jun 2008, Michael Gardner wrote:

I've inherited an existing Apache+mod_perl 1.3.x server. I am not very 
experienced with Apache nor mod_perl, so I have to pick up things as I go 
along.


Recently I built a new version of Apache (1.3.41) with static mod_perl 1.30, 
and it seems to work. The problem is that every few days or so, the server 
apparently runs out of memory, grinding to a halt and necessitating a hard 
reset.


I suspect mod_perl is the primary memory user here, since most of the pages 
we serve are Perl scripts. But I don't know how to go about diagnosing the 
problem, especially since the server gets so bogged down when it happens that 
I can't access it to get info on running processes, memory usage, etc. I have 
noticed that the memory usage of each httpd process seems to grow over time, 
but it's usually very slow growth, and I can't tell if that's really a leak 
or just normal behavior.


Workarounds would be helpful, but naturally I'd prefer to eliminate the cause 
of the problem. I've been looking through the documentation, but haven't made 
much progress so far. How can I get to the bottom of this?


-Michael





Re: Build static mod_perl and apachev2.2.3 fail

2007-01-06 Thread James Smith
On Sat, 6 Jan 2007, Jonathan Vanasco wrote:

>
> On Jan 6, 2007, at 7:39 AM, LUKE wrote:
>
> > Have anyone build static mod_perl and apache v2.0.59 successfully?
>
> most people have long abandoned the static mod_perl route.  its a
> PITA to maintain ( you have to upgrade mp and apache at the same time )
>
> performance wise, there's basically no difference between static and
> dynamic modperl .  if there is anything, its so negligible that its
> imperceptible.
>

I would prefer to use statically linked mod_perl for an open source
project that we distribute, as it is a much easier process to automate
than setting up a dynamically linked mod_perl (managing paths etc.) -
but it simply doesn't install, so if someone can explain how to do it -
as the instructions in the install document don't work - it would be
greatly appreciated.

James


>
> // Jonathan Vanasco
>
> | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - -
> | FindMeOn.com - The cure for Multiple Web Personality Disorder
> | Web Identity Management and 3D Social Networking
> | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - -
> | RoadSound.com - Tools For Bands, Stuff For Fans
> | Collaborative Online Management And Syndication Tools
> | - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - -
>
>
>


Re: Logging to a file

2006-09-22 Thread James Smith

Can you get away with using the apache logs to do this? Use
mod_log_config, save your information in a subprocess_env variable,
and add a %{my_env_var}e entry to the logging directive.
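Sketched out, that suggestion has two halves - a handler that stashes the value in subprocess_env, and a LogFormat that emits it with a %{...}e directive. The handler, variable and cookie names below are illustrative, assuming a mod_perl 2 setup:

```perl
package My::CookieLog;
use strict;
use warnings;
use Apache2::RequestRec ();
use APR::Table ();
use Apache2::Const -compile => qw(DECLINED);

sub handler {
  my $r = shift;
  # pull the cookie value (if any) out of the Cookie header and stash
  # it in subprocess_env so %{tracking_cookie}e can log it
  my ($val) = ( $r->headers_in->get('Cookie') || '' ) =~ /tracking=([^;]+)/;
  $r->subprocess_env->set( 'tracking_cookie', $val || '-' );
  return Apache2::Const::DECLINED;
}
1;
```

and in the Apache configuration:

```apache
PerlLogHandler My::CookieLog
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{tracking_cookie}e\"" withcookie
CustomLog logs/access_log withcookie
```

mod_log_config then does the locking and buffered writes for you, so there is no per-request lock/open/write/close dance in your own code.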

James

On Fri, 22 Sep 2006, Jonathan wrote:

> I need to introduce some new functionality to my webapp app, and I'm
> not quite sure if i can do it in MP.  Hoping someone here can offer a
> suggestion.
>
> essentially, I have a high traffic syndicated image server that is
> currently serving from a vanilla apache instance.
> right now, its not logging anything - but i need to change that.
>
> i now need to log some basic request info ( comparable to what is in
> the access log ), along with the value of a certain cookie if it
> exists - for later parsing ( faster than tossing into a db )
>
> composing the info i need to log under mp is trivial
>
> i'm a bit uneasy about actually logging to a file though-- it looks
> like under a prefork model ( i need 2+ servers to handle this ), i'd
> need to lock / open / write / close / unlock the log file per request
>
> does anyone know of a facility that will let me just log
> straightforward ?
>
> if not, i can hack this together using a non-apache  server pretty
> quickly.   i'd rather just limit my configuration files though :)
>


Re: dynamic loading

2006-08-21 Thread James Smith

For the project I work on, many of our modules derive from a common
Root module which contains the following dynamic_use call. It behaves
exactly like a use line, but does not fail fatally if the module isn't
there - and it also means the code isn't loaded up front (as it would
be if every module were pulled in with a normal use line). It seems to
work OK for the project (which has a nice plugin system):

our $failed_modules;

# Behaves like "use $classname", but returns 0 instead of dying if the
# module is missing or fails to compile, and caches failures so each
# module is only attempted once.
sub dynamic_use {
  my( $self, $classname ) = @_;
  unless( $classname ) {
    my @caller = caller(0);
    warn "Dynamic use called from $caller[1] (line $caller[2]) with no classname parameter\n";
    return 0;
  }
  if( exists $failed_modules->{$classname} ) {
  #  warn "EnsEMBL::Web::Root: tried to use $classname again - this has already failed: $failed_modules->{$classname}";
    return 0;
  }
  my( $parent_namespace, $module ) =
    $classname =~ /^(.*::)(.*)$/ ? ( $1, $2 ) : ( '::', $classname );
  no strict 'refs';
  # Already loaded? The package's symbol table will exist and be non-empty.
  return 1 if $parent_namespace->{$module.'::'}
           && %{ $parent_namespace->{$module.'::'} || {} };
  eval "require $classname";
  if( $@ ) {
    warn "EnsEMBL::Web::Root: failed to use $classname\nEnsEMBL::Web::Root: $@"
      unless $@ =~ /^Can't locate /;
    $parent_namespace->{$module.'::'} = {};
    $failed_modules->{$classname} = $@ || 'Unknown failure when dynamically using module';
    return 0;
  }
  $classname->import();
  return 1;
}

sub dynamic_use_failure {
  my( $self, $classname ) = @_;
  return $failed_modules->{$classname};
}
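The same fail-soft idea can be had more compactly these days with the core Module::Load module - a minimal, self-contained sketch (not the project's actual code):

```perl
use strict;
use warnings;
use Module::Load ();

my %failed;    # modules that already failed, so we only try each once

# Try to load a class at runtime; returns 1 on success, 0 on failure.
sub try_use {
    my ($classname) = @_;
    return 0 if exists $failed{$classname};
    my $ok = eval { Module::Load::load($classname); 1 };
    unless ($ok) {
        $failed{$classname} = $@ || 'Unknown failure';
        return 0;
    }
    return 1;
}
```

So try_use('List::Util') returns 1, while try_use on a missing module returns 0 and records the error for later inspection.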



On Mon, 21 Aug 2006, John ORourke wrote:

> Arshavir Grigorian wrote:
>
> > I am wondering if anyone has experience with a framework for
> > dynamically loading certain modules into an application and executing
> > certain code based on whether a certain module is loaded (available or
> > not). By "dynamically", I do not mean loading run-time, only being
> > able to safely exclude certain modules at startup. One place this
> > could come in handy is when selling an app to 2 clients of whom one
> > has purchased all the functionality and the other has not, and you
> > don't want to give them certain module(s). I am guessing one could
> > accomplish this by using a macro processor (m4) to preprocess Perl
> > code before it's "released", but it would probably be somewhat
> > painful.
>
>
> I'm just finishing a plugin architecture which dynamically loads
> available plugins, but in doing so I found some useful CPAN modules:
>
> Module::Optional - exactly what you're after
> Module::Pluggable - very flexible plugin/optional module manager
>
> hth,
> John
>
>


Re: no cookie on redirect : mod_auth_tkt2

2005-08-14 Thread James Smith

Marc,

You need the cookie to be sent back to the browser. If you redirect
to a URL with no domain in it, Apache "cleverly" notices this and
performs the second request internally, without going back to the
browser. To force a return to the browser, include the full domain
in the URL:
  $r->headers_out->set( Location => "http://my.dn.org/login?redirect=1" );

On Sun, 14 Aug 2005, Marc Lambrichs wrote:

> I'm trying to build a mp2 handler to login using mod_auth_tkt2. I like
> the idea of probing if the client can support cookies, so tried to
> rebuild it. According to the cgi example there is some problem with
> setting the cookie on a redirect. The same problem - no probe cookie is
> set during the redirect - occurs when I use the following method, and I
> can't see why.
>
> #-
> sub login: method
> #-
> {
> my ( $class, $r ) = @_;
>
> #$r->print('sub run ', B::Deparse->new->coderef2text( \&run ));
> my $apr = Apache2::Request->new($r);
>
> ### Check if there are any cookies defined
> my $jar = Apache2::Cookie::Jar->new( $r );
> my $cookie = $jar->cookies( 'tkt' );
> my $probe  = $jar->cookies( 'probe' );
> my $has_cookies = $cookie ||
>   $probe  || '';
> if ( ! $has_cookies ){
>### if this is a self redirect warn the user about cookie support
>if ( $apr->param( 'redirect' ) ) {
>   my @remarks = ( 'Your browser does not appear to support cookies.',
>   'This site requires cookies - please turn cookie support' );
>   my $detail = join '&', map { sprintf "detail=%s", $_ } @remarks;
>   $r->internal_redirect_handler( "/error?type=400&$detail" );
>   return Apache2::Const::OK;
>}
>### If no cookies and not a redirect, redirect to self to test cookies.
>else {
>   $cookie = Apache2::Cookie->new( $r,
>   -name   => "probe",
>   -domain => 'my.domain',
>   -value  => 1 );
>   $cookie->path("/");
>   $cookie->bake( $r );
>   $r->headers_out->set( Location => "/login?redirect=1" );
>
>   return Apache2::Const::REDIRECT;
>}
> }
> }
>
>


Re: AW: Logging user's movements

2005-02-04 Thread James Smith
On Fri, 4 Feb 2005, Denis Banovic wrote:

> Hi Leo,
>
> I have a very similar app running in mod_perl with about half a million
> hits a day. I need to do some optimisation, so I'm just interested in
> which of the optimisations you are using brought you the best
> improvements. Was it preloading modules in the startup.pl, or caching
> the 1x1 gif image, or maybe optimising the database cache (I'm using
> mysql)? I'm sure you also have usage peaks, so it would be interesting
> to know roughly how many hits (inserts)/hour a single server machine
> can handle.
>

Simplest thing to do is hijack the referer logs, and then parse them
at the end. You just need to add a unique ID for each session (via a
cookie or in the URL) which is added to the logs [or placed in a
standard logged variable].

Then write a parser which tracks usage, using referer + page viewed.
If you don't want to rely on referers then you could encrypt this in
the URL... (but watch out for search engines, which could hammer your
site!!)
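The parsing side of that can be sketched in a few lines of Perl - given log lines carrying a date, a session ID and the two pages, keep only the first occurrence of each day/session/page-pair, treating A->B and B->A as the same connection (the field layout here is invented for illustration):

```perl
use strict;
use warnings;

# Each line: "YYYY-MM-DD <session-id> <from-page> <to-page>"
# Returns only the lines that represent a new connection for that
# day and session, ignoring the direction of travel.
sub unique_transitions {
    my (@lines) = @_;
    my (%seen, @unique);
    for my $line (@lines) {
        my ($day, $session, $from, $to) = split ' ', $line;
        next unless defined $to;    # skip malformed lines
        # Sort the pair so dinosaur->bird and bird->dinosaur collide.
        my $pair = join '|', sort $from, $to;
        push @unique, $line unless $seen{"$day/$session/$pair"}++;
    }
    return @unique;
}
```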

James

>
> Thanks
>
> Denis
>
>
>
>
>
> -Original Message-
> From: Leo Lapworth [mailto:[EMAIL PROTECTED]
> Sent: Friday, 4 February 2005 10:37
> To: ben syverson
> Cc: modperl@perl.apache.org
> Subject: Re: Logging user's movements
>
>
> H
> On 4 Feb 2005, at 08:13, ben syverson wrote:
>
> > Hello,
> >
> > I'm curious how the "pros" would approach an interesting system design
> > problem I'm facing. I'm building a system which keeps track of user's
> > movements through a collection of information (for the sake of
> > argument, a Wiki). For example, if John moves from the "dinosaur" page
> > to the "bird" page, the system logs it -- but only once a day per
> > connection between nodes per user. That is, if Jane then travels from
> "dinosaur" to "bird," it will log it, but if "John" moves back
> to "dinosaur" from "bird," it won't be logged. The result is a log of
> > every unique connection made by every user that day.
> >
> > The question is, how would you do this with the least amount of strain
> > on the server?
> >
> I think the standard approach for user tracking is a 1x1 gif, there are
> lots of ways of doing it, here are 2:
>
> Javascript + Logs - update tracking when logs are processed
> -----------------------------------------------------------
>
> Use javascript to set a cookie (session or 24 hours) - if there isn't
> already one. Then use javascript to do a document write to the gif.
>
> so /tracker/c.gif?c=&page=dinosaur
>
> It should then be fast (no live processing) and fairly easy to extract
> this information from the logs and into a db.
>
> Mod_perl - live db updates
> -
> Alternatively if you need live updates create a mod_perl handle that
> sits at /tracker/c.gif, processes the parameters and puts them into a
> database, then returns a gif (I do this, read the gif in and store it
> as a global when the module starts so it just stays in memory). It's
> fast and means you can still get the benefits of caching with squid or
> what ever.
>
> I get about half a million hits a day to my gif.
>
> I think the main point is you should separate it from your main content
> handler if you want it to be flexible and still allow other levels of
> caching.
>
> Cheers
>
> Leo
>
>
> 
>
>


Re: Hosting provider disallows mod_perl - "memory hog / unstable"

2004-09-01 Thread James Smith
>
> That's not entirely true. It is in fact the case that mod_perl's
> *upper bound* on memory usage is similar to the equivalent script
> running as a cgi.
>
> A well designed mod_perl application loads as many shared libraries as
> possible before Apache forks off the child processes. This takes
> advantage of the standard "copy-on-write" behavior of the fork() system
> call; meaning that only the portions of the process memory that differ
> from the parent will actually take up extra memory, the rest is shared
> with the parent until one of them tries to write to that memory, at
> which time it is copied before the change is made, effectively
> "unsharing" that chunk of memory.
>
> Unfortunately, it's not a perfect world, and the Perl interpreter isn't
> perfect either: it mixes code and data segments together throughout the
> process address space. This has the effect that as the code runs and
> variables/structures are changed, some of the surrounding code segments
> in the memory space are swept up into the memory chunks during a
> copy-on-write, thereby slowly duplicating the memory between processes
> (where the code would ideally be indefinitely shared).
> Fortunately, Apache has a built in defence against this memory creep:
> the MaxRequestsPerChild directive forces a child process to die and
> respawn after a certain number of requests have been served, thereby
> forcing the child process to "start fresh" with the maximum amount of
> shared memory.

The bigger problem is that if one of the modules you include in the
pre-fork is foobarred you may/will not be able to start the server...

> In the long run, this means that if you pre-load as many shared
> libraries as possible and tweak the MaxRequestsPerChild directive,
> you'll probably see significantly less memory usage on average. Not to
> mention all the other speed and efficiency increases that you're already
> mod_perl provides.

Apache::SizeLimit is a better approach as it only reaps large children!!
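A sketch of what that looks like for mod_perl 1 (the thresholds here are invented - tune them to your app, and note the unshared-size check only works on platforms where shared memory can be measured):

```perl
# startup.pl
use Apache::SizeLimit;

# Reap a child after the current request if its total size tops
# ~12MB, or if its *unshared* size tops ~6MB - unshared memory is
# what actually costs you under copy-on-write. Values are in KB.
$Apache::SizeLimit::MAX_PROCESS_SIZE  = 12000;
$Apache::SizeLimit::MAX_UNSHARED_SIZE = 6000;
```

with the check wired in via httpd.conf:

    PerlFixupHandler Apache::SizeLimit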

> j
>
>
>
> --
> [EMAIL PROTECTED] (please don't reply to @yahoo)
>
>
>

-- 
Report problems: http://perl.apache.org/bugs/
Mail list info: http://perl.apache.org/maillist/modperl.html
List etiquette: http://perl.apache.org/maillist/email-etiquette.html