Re: perl6 --profile-compile | --profile: both very slow and depend on Internet resources

2016-04-01 Thread Tom Browder
On Fri, Apr 1, 2016 at 10:08 AM, Tom Browder  wrote:
> On Fri, Apr 1, 2016 at 9:39 AM, Timo Paulssen  wrote:
>> The profiler's data blob is a massive, gigantic blob of json (ls the file
>> and you'll see).
>
> Ah, yes:  a 2.8+ million character line!
...
> What about creating a text output in a format something like gprof?
> It looks like tadzik has some code that could be used as a start.

And look at something like:

  https://github.com/open-source-parsers/jsoncpp

to use as the JSON C++ library.

-Tom


Re: A practical benchmark shows speed challenges for Perl 6

2016-04-01 Thread yary
Not sure I understand the disagreement.

"the correct buffer size is often per file, not per
program/invocation, so a one-size-fits-all envar is the wrong
approach"- if you're saying "it would be great to have the buffer size
be an option to 'open'," then I agree. It would be nice to have that
settable as a parameter. At the moment, you can set it by file by
changing $*DEFAULT-READ-ELEMS then creating the filehandle- I read the
source, and the handle saves the value it was given. If you change the
variable later and open a new handle, the new handle gets the new
value while the old keeps its old value. While being able to
explicitly set it per handle would be good, the implementation of this
dynamic variable is a workable first step.
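
The per-handle behaviour described above can be modelled with a small Python sketch. This is only an analogy, not Rakudo's implementation: the Handle class, the module-level DEFAULT_READ_ELEMS and the read_elems parameter (standing in for the wished-for option to 'open') are all made up for illustration.

```python
# Stand-in for the dynamic variable; 64K was the fixed size before the patch.
DEFAULT_READ_ELEMS = 64 * 1024


class Handle:
    """Toy file handle that remembers the buffer size it was created with."""

    def __init__(self, path, read_elems=None):
        self.path = path
        # Snapshot the current default once, at creation time, unless the
        # caller passed an explicit per-handle size (hypothetical option).
        self.read_elems = DEFAULT_READ_ELEMS if read_elems is None else read_elems


def demo():
    """Show that changing the default later only affects new handles."""
    global DEFAULT_READ_ELEMS
    saved = DEFAULT_READ_ELEMS
    try:
        first = Handle("first.txt")             # captures 64K
        DEFAULT_READ_ELEMS = 128 * 1024         # change the default afterwards
        second = Handle("second.txt")           # captures 128K
        third = Handle("third.txt", read_elems=4096)  # explicit per-handle size
        return first.read_elems, second.read_elems, third.read_elems
    finally:
        DEFAULT_READ_ELEMS = saved              # leave the default untouched
```

Calling demo() shows the first handle keeping its original size while the second picks up the new default and the third uses its explicit value.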

" it is rare for the buffer size to be different based on the system
except on systems so small that rakudo isn't going to work on them
anyway (e.g. embedded systems)."

That doesn't make sense to me: before Elizabeth's patch the buffer was
never different on any system, it was always 64K. (And I thought I saw
Rakudo on Raspberry Pi in 2014 or '15 but not sure...)

"I can't see this impacting much in the common case"- right, this isn't
addressing common cases. It's an "infrequently asked question."

I wasn't even thinking of small systems, but large ones with SAN
networks using clever caching over not-quite-properly configured
VLANs- like what I use at work. Some of our servers use the SAN
through a different VLAN. I don't expect a programmer to fine-tune
file-processing code for my situation (unless they want to find the
best buffer size on the fly each run!), but I do appreciate being able
to set the buffer in the environment per-system. I was the original
person asking to be able to set the buffer, and I have seen it make a
difference in this use case.

"In the specific case that prompted this thread, it is the programmer
that wants to specify a very large buffer." In my case, I was
comparing perl5 with C#, and I found that for that particular system,
code, and file, a buffer size of 128k was the sweet spot.
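
A rough way to look for such a sweet spot is to time the same read loop at a few chunk sizes. The Python sketch below is only an illustration of that idea, not the benchmark used in this thread; the function names and the default list of sizes are assumptions.

```python
import time


def time_read(path, buffer_size):
    """Read the whole file in chunks of buffer_size bytes.

    Returns (elapsed_seconds, bytes_read).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 opens an unbuffered raw file, so the chunk size passed to
    # read() is what actually reaches the operating system.
    with open(path, "rb", buffering=0) as fh:
        while True:
            chunk = fh.read(buffer_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def compare_buffer_sizes(path, sizes=(4096, 65536, 131072)):
    """Map each candidate size to its (elapsed_seconds, bytes_read) result."""
    return {size: time_read(path, size) for size in sizes}
```

Timings from a single run are noisy, and after the first pass the file is likely served from the OS cache, so results are only meaningful when repeated on the system, file and storage path that matter.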

"is a small system even going to be able to handle that much data to
begin with?" Not sure how the existence of small systems eliminates
the usefulness of twiddling buffer sizes.

-y


Re: A practical benchmark shows speed challenges for Perl 6

2016-04-01 Thread Brandon Allbery
On Fri, Apr 1, 2016 at 11:09 AM, yary  wrote:

> Setting the buffer size is better done by the user, not the
> programmer. Often the user and the programmer are one and the same, in
> which case, the programmer knows the environment and can set the
> environment variables- or change the code- whichever makes better
> sense.


I would disagree with this; having been in both of those seats,

(a) the correct buffer size is often per file, not per program/invocation,
so a one-size-fits-all envar is the wrong approach;

(b) it is rare for the buffer size to be different based on the system
except on systems so small that rakudo isn't going to work on them anyway
(e.g. embedded systems).

64k is a rather large buffer size relative to libc stdio which is usually
4-8k, but rather small compared to many other aspects of rakudo's memory
usage. I can't see this impacting much in the common case. In the specific
case that prompted this thread, it is the programmer that wants to specify
a very large buffer. And the fact that a very large buffer is wanted is
actually a symptom of an even more significant memory issue: is a small
system even going to be able to handle that much data to begin with? So
again, the *buffer size* is not the important part of the equation and
trying to tune it to reduce the impact on the system is attacking the wrong
part of the problem.

-- 
brandon s allbery kf8nh   sine nomine associates
allber...@gmail.com  ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad        http://sinenomine.net


Re: perl6 --profile-compile | --profile: both very slow and depend on Internet resources

2016-04-01 Thread Timo Paulssen



On 01/04/16 17:08, Tom Browder wrote:
>> Alternatively, there's a qt-based profiler up on tadzik's github that can
>> read the json blob (you will have to --profile-filename=blahblah.json to get
>> that), but it doesn't evaluate as much of the data - it'll potentially even
>> fail completely what with the recent changes i made :S
> ...
> That looks like the place to start to me...


The one big problem with the qt-based profiler is that the json library 
it uses refuses to work with files over a certain limit, and we easily 
reach that limit. So that's super dumb :(

  - Timo


Re: A practical benchmark shows speed challenges for Perl 6

2016-04-01 Thread yary
Actually I would characterize it as

Before:

The programmer had no control over the buffer size, and the user of
the code had no way of adjusting the buffer to a particular system.

Currently:

The programmer has control over the buffer size, and the user of the
code can adjust the buffer to a particular system.


Setting the buffer size is better done by the user, not the
programmer. Often the user and the programmer are one and the same, in
which case, the programmer knows the environment and can set the
environment variables- or change the code- whichever makes better
sense.

If you're writing code for others to use, then the optimal buffer size
isn't known by you. The programmer can either leave it alone and let
the user set an environment variable; or can hard-code it to some
fixed value- for example a multiple of a fixed record size- or can
have code scale the dynamic variable (which is either the default or
set via the user's environment)- for example because the code forks N
times and you've noticed it performs better with a buffer 1/Nth the
usual size (purely hypothetical).

When I hear "Every program that was made under the previous paradigm
now needs to be modified to check the environment to avoid undesired
side effects" what I think is "no, every program that cares can say
INIT $*DEFAULT-READ-ELEMS=65536 thus ignoring the environment. But if
someone gave me a module or program that ignored my wishes, I'd edit
it away."


There are times when you want to ignore the environment - like in Perl
5's taint mode, which if I recall correctly, clears $ENV{PATH} and a
few other things. But in general code uses bits of the environment
because the user wants it that way. If the user is fiddling with
buffer size, then the user knows something or is debugging something
about the system which the programmer didn't need to think about.


Re: perl6 --profile-compile | --profile: both very slow and depend on Internet resources

2016-04-01 Thread Timo Paulssen
The profiler's data blob is a massive, gigantic blob of json (ls the 
file and you'll see).


You can easily search and replace the URLs to point at local files
instead of the CDN.
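
As a minimal sketch of that search-and-replace step, the Python below rewrites URLs on the CDN host named in this thread so that they point into a local directory. The directory name and the function names are assumptions for illustration, and the assets themselves would still have to be downloaded by hand.

```python
import re
from pathlib import Path

# Assumption: the CSS/JS files were saved by hand into this directory, which
# sits next to the profile HTML. The name is made up for this example.
LOCAL_ASSET_DIR = "profile-assets"

# Any http(s) URL on the CDN host from this thread, up to the closing quote,
# whitespace or angle bracket.
CDN_URL = re.compile(r"""https?://maxcdn\.bootstrapcdn\.com/[^\s"'<>]+""")


def localize_cdn_urls(html, asset_dir=LOCAL_ASSET_DIR):
    """Replace each CDN URL with asset_dir/<last path component>."""

    def to_local(match):
        filename = match.group(0).rsplit("/", 1)[-1]
        return f"{asset_dir}/{filename}"

    return CDN_URL.sub(to_local, html)


def localize_profile(path):
    """Rewrite a profile HTML file in place (reads the whole file into memory)."""
    profile = Path(path)
    text = profile.read_text(encoding="utf-8")
    profile.write_text(localize_cdn_urls(text), encoding="utf-8")
```

Other hosts referenced by the profile page, if any, would need their own patterns; this sketch only covers the one host quoted in the thread.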


Alternatively, there's a qt-based profiler up on tadzik's github that 
can read the json blob (you will have to 
--profile-filename=blahblah.json to get that), but it doesn't evaluate 
as much of the data - it'll potentially even fail completely what with 
the recent changes i made :S


The biggest contributor to filesize for the profiler is the complexity 
of the call tree. If you can cut out parts and pieces of your program, 
you should be able to profile them individually just fine.


In my experience, firefox is more likely to work with the big profiles.

If anybody is interested in improving our html/js profiler front-end, 
please do speak up and you'll be showered with as much guidance and 
praise as you need :)

  - Timo


perl6 --profile-compile | --profile: both very slow and depend on Internet resources

2016-04-01 Thread Tom Browder
Is there any easy way to get the profilers to use local code (css, js,
etc.) rather than reading across a sometimes slow internet connection?

I'm using both Chrome and Iceweasel with the same effects: slow
loading scripts and always seem to be reading:

https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css
https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap-theme.min.css

Thanks.

Best regards,

-Tom


Re: A practical benchmark shows speed challenges for Perl 6

2016-04-01 Thread Jan Ingvoldstad
On Fri, Apr 1, 2016 at 2:00 PM, Elizabeth Mattijsen  wrote:

> Sorry if I wasn’t clear: If there is no dynamic var, it will make one:
> either from the environment, or set it to 64K (like it was before).  So no
> programmer action is ever needed if they’re not interested in that type of
> optimization.
>

That was abundantly clear.


> At the moment it is nothing but a balloon that I let go up in the air.
> Question is still out on whether this will continue to live until the next
> release, or that it will be replaced by something more low level, at the VM
> level.
>
> If you put garbage in the environment, it will die trying to coerce that
> to an integer.
>

Sorry for bringing that up, as it evidently confused the issue.

I'll try to explain the problem once again, with feeling ;) – hoping that
I'm being clearer this time.

Before:

The programmer knows that the buffer size is 64K unless the programmer asks
for something different. A typical Perl program reading buffered input does
not need to worry about anything, unless the programmer wants to have
smaller or larger buffers.

In other words: fire and forget.

Currently:

The programmer does not know what the buffer size is, as it can either be
the default, or set by the environment. Every program that was made under
the previous paradigm now needs to be modified to check the environment to
avoid undesired side effects.

Every future program also needs to include code that checks the environment
to avoid undesired side effects.

-- 
Jan


Re: A practical benchmark shows speed challenges for Perl 6

2016-04-01 Thread Elizabeth Mattijsen
> On 01 Apr 2016, at 13:50, Jan Ingvoldstad  wrote:
> 
> On Thu, Mar 31, 2016 at 10:36 AM, Elizabeth Mattijsen  wrote:
> > The reasoning behind _not_ setting things via environment variables, is 
> > that this means the programmer now needs to worry what e.g. the webserver 
> > running the Perl program does, and there are unknown stability (and 
> > possibly security) implications. This adds bloat to the program.
> >
> > The programmer is better off if they only explicitly need to worry about it 
> > when they want to change the defaults.
> 
> The environment variable is only used if there is no dynamic variable found.  
> So, if a programmer wishes to use a specific buffer size in the program, they 
> can.
>  
> This is precisely _not_ addressing the issue I raised.
> 
> This way, the programmer _needs_ to explicitly check whether the environment 
> variable is set, and if not, somehow set a sensible default if the 
> environment variable differs from the default.
> 
> That adds quite a bit of unnecessary programming to each Perl program that 
> deals with buffers.
> The status as it was before, was that the programmer didn't need to worry 
> about the environment for buffer size.

Sorry if I wasn’t clear: If there is no dynamic var, it will make one: either 
from the environment, or set it to 64K (like it was before).  So no programmer 
action is ever needed if they’re not interested in that type of optimization.


> If a malicious environment sets the buffer size to something undesirable, 
> there may be side effects that are hard to predict, and may have other 
> implications than merely performance.
> 
> I think it is preferable that the decision about that is made by the 
> programmer rather than the environment.
> 
> PS: I'm assuming that $*DEFAULT-READ-ELEMS is clean by the time it reaches 
> any code, that is that it only contains _valid_ integer values and cannot 
> lead to overflows or anything, I am not concerned about that.

At the moment it is nothing but a balloon that I let go up in the air.  
Question is still out on whether this will continue to live until the next 
release, or that it will be replaced by something more low level, at the VM 
level.

If you put garbage in the environment, it will die trying to coerce that to an 
integer.
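
The lookup order described in this message (a value set by the program wins, otherwise the environment, otherwise 64K, with a hard failure on garbage) can be sketched in Python as follows. This is an illustration of the described behaviour only, not the Rakudo code; the environment variable name EXAMPLE_READ_ELEMS and the function name are hypothetical.

```python
import os

FALLBACK_READ_ELEMS = 64 * 1024      # the fixed size used before the patch
ENV_NAME = "EXAMPLE_READ_ELEMS"      # hypothetical name, not the real one


def resolve_read_elems(explicit=None, environ=None):
    """Pick the buffer size: explicit value, then environment, then 64K."""
    if explicit is not None:
        # The program set a value itself, so the environment is never consulted.
        return explicit
    env = os.environ if environ is None else environ
    raw = env.get(ENV_NAME)
    if raw is None:
        return FALLBACK_READ_ELEMS
    # Garbage in the environment raises ValueError here, mirroring
    # "it will die trying to coerce that to an integer".
    return int(raw)
```

In this model a program that never mentions the buffer size gets the old 64K unless the environment says otherwise, and a program that sets the value explicitly is unaffected by the environment.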


Liz



Re: A practical benchmark shows speed challenges for Perl 6

2016-04-01 Thread Jan Ingvoldstad
On Thu, Mar 31, 2016 at 10:36 AM, Elizabeth Mattijsen wrote:

> > The reasoning behind _not_ setting things via environment variables, is
> > that this means the programmer now needs to worry what e.g. the webserver
> > running the Perl program does, and there are unknown stability (and
> > possibly security) implications. This adds bloat to the program.
> >
> > The programmer is better off if they only explicitly need to worry about
> > it when they want to change the defaults.
>
> The environment variable is only used if there is no dynamic variable
> found.  So, if a programmer wishes to use a specific buffer size in the
> program, they can.


This is precisely _not_ addressing the issue I raised.

This way, the programmer _needs_ to explicitly check whether the
environment variable is set, and if not, somehow set a sensible default if
the environment variable differs from the default.

That adds quite a bit of unnecessary programming to each Perl program that
deals with buffers.

The status as it was before, was that the programmer didn't need to worry
about the environment for buffer size.

If a malicious environment sets the buffer size to something undesirable,
there may be side effects that are hard to predict, and may have other
implications than merely performance.

I think it is preferable that the decision about that is made by the
programmer rather than the environment.

PS: I'm assuming that $*DEFAULT-READ-ELEMS is clean by the time it reaches
any code, that is that it only contains _valid_ integer values and cannot
lead to overflows or anything, I am not concerned about that.
-- 
Jan