Re: perl6 --profile-compile | --profile: both very slow and depend on Internet resources
On Fri, Apr 1, 2016 at 10:08 AM, Tom Browder wrote:
> On Fri, Apr 1, 2016 at 9:39 AM, Timo Paulssen wrote:
>> The profiler's data blob is a massive, gigantic blob of json (ls the file
>> and you'll see).
>
> Ah, yes: a 2.8+ million character line!
...
> What about creating a text output in a format something like gprof?
> It looks like tadzik has some code that could be used as a start.

And look at something like:

  https://github.com/open-source-parsers/jsoncpp

to use as the JSON C++ library.

-Tom
Re: A practical benchmark shows speed challenges for Perl 6
Not sure I understand the disagreement.

"the correct buffer size is often per file, not per program/invocation, so a one-size-fits-all envar is the wrong approach" - if you're saying "it would be great to have the buffer size be an option to 'open'", then I agree. It would be nice to have that settable as a parameter. At the moment, you can set it per file by changing $*DEFAULT-READ-ELEMS before creating the filehandle - I read the source, and the handle saves the value it was given. If you change the variable later and open a new handle, the new handle gets the new value while the old one keeps its old value. While being able to explicitly set it per handle would be better, the implementation via this dynamic variable is a workable first step.

"it is rare for the buffer size to be different based on the system except on systems so small that rakudo isn't going to work on them anyway (e.g. embedded systems)." That doesn't make sense to me: before Elizabeth's patch the buffer was never different on any system, it was always 64K. (And I thought I saw Rakudo on a Raspberry Pi in 2014 or '15, but I'm not sure...)

"I can't see this impacting much in the common case" - right, this isn't addressing common cases; it's an "infrequently asked question." I wasn't even thinking of small systems, but large ones with SAN networks using clever caching over not-quite-properly configured VLANs - like what I use at work. Some of our servers use the SAN through a different VLAN. I don't expect a programmer to fine-tune file-processing code for my situation (unless they want to find the best buffer size on the fly each run!), but I do appreciate being able to set the buffer in the environment per system. I was the original person asking to be able to set the buffer, and I have seen it make a difference in this use case.

"In the specific case that prompted this thread, it is the programmer that wants to specify a very large buffer." In my case, I was comparing perl5 with C#, and I found that for that particular system, code, and file, a buffer size of 128k was the sweet spot.

"is a small system even going to be able to handle that much data to begin with?" Not sure how the existence of small systems eliminates the usefulness of twiddling buffer sizes.

-y
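The per-handle behaviour described above (each handle captures $*DEFAULT-READ-ELEMS at creation time) could be sketched like this. This is a minimal sketch based on my reading of the thread: the variable name comes from the discussion, but the file names and the 128K figure for a specific handle are made-up illustrations.

```raku
# Each handle saves the value of $*DEFAULT-READ-ELEMS it sees at open time.
my $*DEFAULT-READ-ELEMS = 128 * 1024;    # 128K, e.g. the SAN sweet spot
my $big = open 'bulk-data.txt', :r;      # this handle reads in 128K chunks

$*DEFAULT-READ-ELEMS = 65536;            # back to the classic 64K
my $small = open 'config.txt', :r;       # this handle reads in 64K chunks

# $big still uses 128K: changing the variable after the fact
# does not affect handles that already exist.
$big.close;
$small.close;
```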
Re: A practical benchmark shows speed challenges for Perl 6
On Fri, Apr 1, 2016 at 11:09 AM, yary wrote:
> Setting the buffer size is better done by the user, not the
> programmer. Often the user and the programmer are one and the same, in
> which case, the programmer knows the environment and can set the
> environment variables - or change the code - whichever makes better
> sense.

I would disagree with this; having been in both of those seats, (a) the correct buffer size is often per file, not per program/invocation, so a one-size-fits-all envar is the wrong approach; (b) it is rare for the buffer size to be different based on the system, except on systems so small that rakudo isn't going to work on them anyway (e.g. embedded systems).

64k is a rather large buffer size relative to libc stdio, which is usually 4-8k, but rather small compared to many other aspects of rakudo's memory usage. I can't see this impacting much in the common case. In the specific case that prompted this thread, it is the programmer that wants to specify a very large buffer. And the fact that a very large buffer is wanted is actually a symptom of an even more significant memory issue: is a small system even going to be able to handle that much data to begin with? So again, the *buffer size* is not the important part of the equation, and trying to tune it to reduce the impact on the system is attacking the wrong part of the problem.

--
brandon s allbery kf8nh | sine nomine associates
allber...@gmail.com | ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad | http://sinenomine.net
Re: perl6 --profile-compile | --profile: both very slow and depend on Internet resources
On 01/04/16 17:08, Tom Browder wrote:
>> Alternatively, there's a qt-based profiler up on tadzik's github that can
>> read the json blob (you will have to --profile-filename=blahblah.json to
>> get that), but it doesn't evaluate as much of the data - it'll potentially
>> even fail completely what with the recent changes I made :S
> ...
> That looks like the place to start to me...

The one big problem with the qt-based profiler is that the json library it uses refuses to work with files over a certain limit, and we easily reach that limit. So that's super dumb :(

- Timo
Re: A practical benchmark shows speed challenges for Perl 6
Actually I would characterize it as:

Before: The programmer had no control over the buffer size, and the user of the code had no way of adjusting the buffer to a particular system.

Currently: The programmer has control over the buffer size, and the user of the code can adjust the buffer to a particular system.

Setting the buffer size is better done by the user, not the programmer. Often the user and the programmer are one and the same, in which case the programmer knows the environment and can set the environment variables - or change the code - whichever makes better sense. If you're writing code for others to use, then the optimal buffer size isn't known by you. The programmer can either leave it alone and let the user set an environment variable; or hard-code it to some fixed value - for example, a multiple of a fixed record size; or have code scale the dynamic variable (which is either the default or set via the user's environment) - for example, because the code forks N times and you've noticed it performs better with a buffer 1/Nth the usual size (purely hypothetical).

When I hear "Every program that was made under the previous paradigm now needs to be modified to check the environment to avoid undesired side effects", what I think is: "no, every program that cares can say INIT $*DEFAULT-READ-ELEMS = 65536, thus ignoring the environment. But if someone gave me a module or program that ignored my wishes, I'd edit it away."

There are times when you want to ignore the environment - like in Perl 5's taint mode, which, if I recall correctly, clears $ENV{PATH} and a few other things. But in general, code uses bits of the environment because the user wants it that way. If the user is fiddling with buffer size, then the user knows something, or is debugging something, about the system which the programmer didn't need to think about.
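The "every program that cares" pattern above could be sketched as follows. This is a sketch under the thread's assumptions: INIT runs once at program startup, so the pinned value is in place before any later code opens handles; the file name here is hypothetical.

```raku
# A program that wants the classic fixed behaviour can pin the read
# buffer to 64K at startup, ignoring whatever the user's environment
# would otherwise supply as a default.
INIT $*DEFAULT-READ-ELEMS = 65536;

my $fh = open 'input.log', :r;   # hypothetical file; reads in 64K chunks
for $fh.lines -> $line {
    # ... process $line ...
}
$fh.close;
```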
Re: perl6 --profile-compile | --profile: both very slow and depend on Internet resources
The profiler's data blob is a massive, gigantic blob of json (ls the file and you'll see). You can easily search and replace the URLs to point at local files instead of the CDN.

Alternatively, there's a qt-based profiler up on tadzik's github that can read the json blob (you will have to --profile-filename=blahblah.json to get that), but it doesn't evaluate as much of the data - it'll potentially even fail completely what with the recent changes I made :S

The biggest contributor to filesize for the profiler is the complexity of the call tree. If you can cut out parts and pieces of your program, you should be able to profile them individually just fine. In my experience, Firefox is more likely to work with the big profiles.

If anybody is interested in improving our html/js profiler front-end, please do speak up and you'll be showered with as much guidance and praise as you need :)

- Timo
perl6 --profile-compile | --profile: both very slow and depend on Internet resources
Is there any easy way to get the profilers to use local code (css, js, etc.) rather than reading across a sometimes slow internet connection? I'm using both Chrome and Iceweasel with the same effects: slow-loading scripts, and they always seem to be reading:

  https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css
  https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap-theme.min.css

Thanks.

Best regards,

-Tom
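One workaround along the lines Timo suggests elsewhere in the thread (search-and-replace the URLs to point at local files) could look like this. A sketch only: the output file name `profile.html` and the exact asset list are assumptions based on the URLs quoted above.

```shell
# Fetch the Bootstrap assets the profiler page references, one time.
mkdir -p assets
curl -sL -o assets/bootstrap.min.css \
  https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap.min.css
curl -sL -o assets/bootstrap-theme.min.css \
  https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/bootstrap-theme.min.css

# Rewrite the generated profile page to reference the local copies.
sed -i 's|https://maxcdn.bootstrapcdn.com/bootstrap/3.3.5/css/|assets/|g' \
  profile.html
```

After this, the page loads its stylesheets from disk, so viewing a profile no longer depends on the CDN being reachable.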
Re: A practical benchmark shows speed challenges for Perl 6
On Fri, Apr 1, 2016 at 2:00 PM, Elizabeth Mattijsen wrote:
> Sorry if I wasn’t clear: If there is no dynamic var, it will make one:
> either from the environment, or set it to 64K (like it was before). So no
> programmer action is ever needed if they’re not interested in that type of
> optimization.

That was abundantly clear.

> At the moment it is nothing but a balloon that I let go up in the air.
> Question is still out on whether this will continue to live until the next
> release, or that it will be replaced by something more low level, at the VM
> level.
>
> If you put garbage in the environment, it will die trying to coerce that
> to an integer.

Sorry for bringing that up, as it evidently confused the issue.

I'll try to explain the problem once again, with feeling ;) - hoping that I'm being clearer this time.

Before: The programmer knows that the buffer size is 64K unless the programmer asks for something different. A typical Perl program reading buffered input does not need to worry about anything, unless the programmer wants smaller or larger buffers. In other words: fire and forget.

Currently: The programmer does not know what the buffer size is, as it can either be the default or set by the environment. Every program that was made under the previous paradigm now needs to be modified to check the environment to avoid undesired side effects. Every future program also needs to include code that checks the environment to avoid undesired side effects.

-- Jan
Re: A practical benchmark shows speed challenges for Perl 6
> On 01 Apr 2016, at 13:50, Jan Ingvoldstad wrote:
> On Thu, Mar 31, 2016 at 10:36 AM, Elizabeth Mattijsen wrote:
>> The reasoning behind _not_ setting things via environment variables, is
>> that this means the programmer now needs to worry what e.g. the webserver
>> running the Perl program does, and there are unknown stability (and
>> possibly security) implications. This adds bloat to the program.
>>
>> The programmer is better off if they only explicitly need to worry about it
>> when they want to change the defaults.
>>
>> The environment variable is only used if there is no dynamic variable found.
>> So, if a programmer wishes to use a specific buffer size in the program, they
>> can.
>
> This is precisely _not_ addressing the issue I raised.
>
> This way, the programmer _needs_ to explicitly check whether the environment
> variable is set, and if not, somehow set a sensible default if the
> environment variable differs from the default.
>
> That adds quite a bit of unnecessary programming to each Perl program that
> deals with buffers.
> The status as it was before, was that the programmer didn't need to worry
> about the environment for buffer size.

Sorry if I wasn’t clear: If there is no dynamic var, it will make one: either from the environment, or set it to 64K (like it was before). So no programmer action is ever needed if they’re not interested in that type of optimization.

> If a malicious environment sets the buffer size to something undesirable,
> there may be side effects that are hard to predict, and may have other
> implications than merely performance.
>
> I think it is preferable that the decision about that is made by the
> programmer rather than the environment.
>
> PS: I'm assuming that $*DEFAULT-READ-ELEMS is clean by the time it reaches
> any code, that is that it only contains _valid_ integer values and cannot
> lead to overflows or anything,

I am not concerned about that.
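The fallback Elizabeth describes (dynamic var if set, otherwise the environment, otherwise 64K) could be sketched like this. Note the assumptions: the thread never names the environment variable, so RAKUDO_DEFAULT_READ_ELEMS below is hypothetical, and this is a reconstruction of the described behaviour, not the actual rakudo implementation.

```raku
# Reconstruction of the described startup fallback: take the buffer
# size from the environment if present, otherwise use the classic 64K.
# A garbage value dies at the .Int coercion, as described in the thread.
my $*DEFAULT-READ-ELEMS = %*ENV<RAKUDO_DEFAULT_READ_ELEMS>
    ?? %*ENV<RAKUDO_DEFAULT_READ_ELEMS>.Int
    !! 65536;

# Code that declares its own $*DEFAULT-READ-ELEMS shadows this one,
# which is why "no programmer action is ever needed" otherwise.
```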
At the moment it is nothing but a balloon that I let go up in the air. The question is still out on whether this will continue to live until the next release, or whether it will be replaced by something more low-level, at the VM level.

If you put garbage in the environment, it will die trying to coerce that to an integer.

Liz
Re: A practical benchmark shows speed challenges for Perl 6
On Thu, Mar 31, 2016 at 10:36 AM, Elizabeth Mattijsen wrote:
>> The reasoning behind _not_ setting things via environment variables, is
>> that this means the programmer now needs to worry what e.g. the webserver
>> running the Perl program does, and there are unknown stability (and
>> possibly security) implications. This adds bloat to the program.
>>
>> The programmer is better off if they only explicitly need to worry about
>> it when they want to change the defaults.
>
> The environment variable is only used if there is no dynamic variable
> found. So, if a programmer wishes to use a specific buffer size in the
> program, they can.

This is precisely _not_ addressing the issue I raised.

This way, the programmer _needs_ to explicitly check whether the environment variable is set, and if not, somehow set a sensible default if the environment variable differs from the default. That adds quite a bit of unnecessary programming to each Perl program that deals with buffers. The status as it was before was that the programmer didn't need to worry about the environment for buffer size.

If a malicious environment sets the buffer size to something undesirable, there may be side effects that are hard to predict, and may have other implications than merely performance. I think it is preferable that the decision about that is made by the programmer rather than the environment.

PS: I'm assuming that $*DEFAULT-READ-ELEMS is clean by the time it reaches any code, that is, that it only contains _valid_ integer values and cannot lead to overflows or anything; I am not concerned about that.

-- Jan