Hey Ken,

I will echo what Chris says: this is much, much easier if you can write a
10-line program that reproduces the error on your machine. We might be able
to guess at a 10-liner that produces the problem, but then if it doesn't
produce the problem, we can't tell whether it's not actually a bug or
whether we just guessed wrong.

The implicit type conversion mechanics are built deeply into the PDL::PP
machinery. Usually, PDL::PP will construct code paths for each of the basic
types, and up-convert (i.e. copy to a temporary piddle) any piddles
involved in the operation that are of a "smaller" type. The problem here is
that the default piddle for a Perl scalar is of type Double, so the
machinery heedlessly copies all of your byte data into a temporary Double
piddle before attempting the comparison. 99% of the time, this leads to
behavior that you want and expect, with fairly minimal cost.
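
To put a rough number on that copy: a Byte element takes 1 byte and a
Double takes 8, so the temporary is eight times the size of the original.
Here is a back-of-the-envelope sketch in plain Perl (no PDL needed), using
the ~200 MB figure from earlier in the thread. The variable names are just
for illustration.

```perl
use strict;
use warnings;

# Bytes per element for the two PDL types involved.
my %bytes_per_element = ( byte => 1, double => 8 );

# Approximate size of the byte piddle mentioned earlier in the thread.
my $byte_piddle_mb = 200;

# Size of the temporary Double copy made by the up-conversion.
my $double_copy_mb =
    $byte_piddle_mb * $bytes_per_element{double} / $bytes_per_element{byte};

printf "byte piddle: %d MB, temporary double copy: %d MB\n",
    $byte_piddle_mb, $double_copy_mb;
```

That comes to about 1.6 GB, which matches the estimate you gave and is over
the 1 GB limit that PDL complained about.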

I don't think, however, that this is a bug in PDL. It is a wart, for sure,
but I think that this is the cost of using the tool. Of course, we could
have PDL::PP generate one code path for each possible combination of input
types, but that would mean generating 7**2 code paths for a function that
takes two inputs, 7**3 code paths for a function that takes three inputs,
and so on. This
would lead to PDL-based libraries that are enormous, and which would blow
your CPU's code cache so often it would noticeably slow down PDL's
operation. An alternative would be to change how PDL::PP functions are
generated to perform data type conversions in much smaller batches. This
second solution would be a major reprogramming effort, not to be taken
lightly: we would have to rewrite the tool, and we would have to rewrite
the PDL methods to make use of the new capability. I've thought about it,
for sure, but I don't have the time to put into that effort at the moment.
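
For a sense of scale, the counts are easy to tabulate. This is a throwaway
plain-Perl sketch, assuming seven basic types as in the figures above. The
helper name is made up for illustration and is not part of PDL::PP.

```perl
use strict;
use warnings;

# Number of basic PDL data types, per the 7**N figures quoted above.
my $num_types = 7;

# One code path per combination of input types: $num_types ** $num_inputs.
# (Hypothetical helper, only for this illustration.)
sub code_paths_per_combination {
    my ($num_inputs) = @_;
    return $num_types**$num_inputs;
}

for my $num_inputs ( 2 .. 4 ) {
    printf "%d inputs: %d code paths\n",
        $num_inputs, code_paths_per_combination($num_inputs);
}
```

Two inputs give 49 paths, three give 343, and four already give 2401, all
for a single function.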

I hope that gives you some context on why this is a problem. Like Chris
said, the sumover upconverts any integer sum into a data type that has a
chance of performing the operation, a solution that fixes a more common
problem. It might be possible to create a similar sumover that does not
perform the upconversion, which would be useful for folks in your
situation. If you give a short code snippet, I might be able to quickly
hack an Inline::Pdlpp solution for you. :-)
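
In the meantime, the cast that Chris suggested might look roughly like the
sketch below. The data is a small hypothetical stand-in ($sigs and $ndays
are made-up values, not your real ones), and I have not run this against
your setup, so treat it as a starting point rather than a fix.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical stand-in for the real data: a small byte piddle.
my $sigs  = zeroes( byte, 4, 3, 2 );
my $ndays = 4;

# Cast each intermediate result back to byte explicitly.
my $counts  = byte( sumover( $sigs > 0 ) );
my $present = byte( $counts > $ndays * 0.9 );

# info() reports the type and dimensions of each piddle.
print '$sigs:    ', $sigs->info,    "\n";
print '$counts:  ', $counts->info,  "\n";
print '$present: ', $present->info, "\n";
```

Note that this only shrinks the results you keep around. As far as I can
tell, any temporaries created inside each operation are still allocated, so
it may not help if the temporary itself is what exceeds the limit. Casting
the sum to byte can also overflow if a count exceeds 255.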

Hope that helps!
David

On Fri, Nov 21, 2014 at 2:30 PM, Chris Marshall <devel.chm...@gmail.com>
wrote:

> I think we've tracked this issue down as coming from various
> implicit type conversions that take place in processing the
> computations.  It appears that sumover() of an integer type
> uses the largest integer type available to compute, presumably
> to avoid overflow.
>
> Similarly, a numerical comparison between a piddle and a perl
> scalar turns into a piddle versus a double piddle which gets
> promoted to a double.  It seems there is room to clean up the
> type conversion (or add some back conversions) to these ops.
>
> In the meantime, if you cast the sumover results to byte along
> with the results of the comparison you might be able to get
> things running.  Definitely something to fix.
>
> (This is *much* easier to debug if you would generate
> a small test program that illustrates the problem rather than
> talking about code snippets that won't work.... saves time
> and ensures I'm running *exactly* what you are running.)
>
> Cheers,
> Chris
>
>
> On Fri, Nov 21, 2014 at 2:02 PM, LYONS, KENNETH B (KENNETH)
> <k...@research.att.com> wrote:
> > Uname doesn't give the amount of memory: 2.6.9-103.ELsmp #1 SMP Fri Dec
> > 9 04:43:08 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
> >
> > But /usr/bin/top shows it's 4 GB (which I think is buried down in the
> > thread below somewhere).
> >
> > I set up some code to reproduce it, and discovered what the problem
> > was.  The simple operation using operands that were explicitly generated as
> > byte worked fine.  The operand $present, in what I had shown you before,
> > though, was generated by a line like this (where $sigs is a byte PDL):
> >         $present = ((sumover($sigs>0)->xchg(0,1)->sumover) > ($ndays*0.9);
> > And, it seems that it winds up as type *double* because of the
> > *comparison* to a float value!
> >
> > If I replace ($ndays*0.9) with "int($ndays*0.9)", then it generates the
> > operand as "Long D".  Which also fails, of course.
> >
> > This behavior is something I would regard as a bug, since I'd expect the
> > result to be of type byte, if it started that way, and the only operation
> > was a comparison, but the folks who have specified the language have to
> > decide how they want it to work.
> >
> > But operationally, is there a way to force that intermediate to stay as
> > byte?
> >
> > I also tried this, by the way, to see if using a byte PDL as the
> > comparison operand would change things:
> >
> > $test = zeros byte, 3;
> > $present = ((sumover($sigs)>0)->xchg(0,1)->sumover) > $test;  # now the operand is explicitly of type byte!
> >
> > And the result is again Long D.  Now that really looks like a bug to me.
> >
> > It turns out that even just "sumover($sigs)>0" produces a result that is
> > "Long D."  Is there any way to control that behavior?
> >
> > Ken
> >
> >
> > -----Original Message-----
> > From: Chris Marshall [mailto:devel.chm...@gmail.com]
> > Sent: Friday, November 21, 2014 1:17 PM
> > To: LYONS, KENNETH B (KENNETH)
> > Cc: Derek Lamb; perldl
> > Subject: Re: [Perldl] matching vectors inside a PDL
> >
> > Ken-
> >
> > You should also have pdl2 on your system as
> > well.  If you have the needed prerequisite module
> > Devel::REPL installed, then pdl2 will give you
> > the new PDL shell, otherwise, it falls back to the
> > perldl shell transparently.
> >
> > I don't see anything funny so the next step is to get
> > a short code snippet that reproduces the problem
> > and error for us to try to reproduce.
> >
> > If this is a bug, I would like to fix it but we'll need to
> > be able to reproduce the problem.  Please send
> > information on your system: uname -a output, amount
> > of memory,...
> >
> > --Chris
> >
> > On Fri, Nov 21, 2014 at 1:09 PM, LYONS, KENNETH B (KENNETH)
> > <k...@research.att.com> wrote:
> >> Chris
> >>
> >> I didn't know perldl was on my system!  It got squirreled away in the
> >> perl directory, outside my path.  I found it with locate.   Here's the
> >> output:
> >>
> >> Summary of my PDL configuration
> >>
> >> VERSION: PDL v2.007 (supports bad values)
> >>
> >> $%PDL::Config = {
> >>                   'BADVAL_PER_PDL' => '0',
> >>                   'WITH_PROJ' => '0',
> >>                   'PDL_CONFIG_VERSION' => '0.005',
> >>                   'POSIX_THREADS_INC' => undef,
> >>                   'FFTW_TYPE' => 'double',
> >>                   'PDL_BUILD_DIR' => '/home/vip/.cpan/build/PDL-2.007',
> >>                   'FFTW_LIBS' => undef,
> >>                   'WITH_FFTW' => '0',
> >>                   'GSL_LIBS' => undef,
> >>                   'WITH_IO_BROWSER' => '0',
> >>                   'PROJ_INC' => undef,
> >>                   'WHERE_PLPLOT_INCLUDE' => undef,
> >>                   'HTML_DOCS' => '1',
> >>                   'SKIP_KNOWN_PROBLEMS' => '0',
> >>                   'WHERE_PLPLOT_LIBS' => undef,
> >>                   'WITH_3D' => '0',
> >>                   'WITH_POSIX_THREADS' => '1',
> >>                   'POGL_VERSION' => '0.6702',
> >>                   'FFTW_INC' => undef,
> >>                   'HIDE_TRYLINK' => '1',
> >>                   'HDF_INC' => undef,
> >>                   'WITH_HDF' => '0',
> >>                   'POGL_WINDOW_TYPE' => 'glut',
> >>                   'WITH_GD' => '1',
> >>                   'WITH_BADVAL' => '1',
> >>                   'FITS_LEGACY' => '1',
> >>                   'WITH_SLATEC' => '0',
> >>                   'BADVAL_USENAN' => '0',
> >>                   'WITH_DEVEL_REPL' => '1',
> >>                   'TEMPDIR' => '/tmp',
> >>                   'PROJ_LIBS' => undef,
> >>                   'USE_POGL' => '0',
> >>                   'PDL_BUILD_VERSION' => '2.007',
> >>                   'GD_LIBS' => undef,
> >>                   'GSL_INC' => undef,
> >>                   'GD_INC' => undef,
> >>                   'WITH_GSL' => '0',
> >>                   'OPTIMIZE' => undef,
> >>                   'PDLDOC_IGNORE_AUTOLOADER' => '0',
> >>                   'HDF_LIBS' => undef,
> >>                   'POSIX_THREADS_LIBS' => undef,
> >>                   'MALLOCDBG' => {},
> >>                   'WITH_MINUIT' => '0',
> >>                   'WITH_PLPLOT' => '0',
> >>                   'MINUIT_LIB' => undef
> >>                 };
> >> Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
> >>   Platform:
> >>     osname=linux, osvers=2.6.9-55.0.2.elsmp, archname=x86_64-linux
> >>     uname='linux x.removed.by.me 2.6.9-55.0.2.elsmp #1 smp tue jun 26
> >> 14:14:47 edt 2007 x86_64 x86_64 x86_64 gnulinux '
> >>     config_args=''
> >>     hint=recommended, useposix=true, d_sigaction=define
> >>     usethreads=undef use5005threads=undef useithreads=undef
> >>     usemultiplicity=undef
> >>     useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
> >>     use64bitint=define use64bitall=define uselongdouble=undef
> >>     usemymalloc=n, bincompat5005=undef
> >>   Compiler:
> >>     cc='cc', ccflags ='-fno-strict-aliasing -pipe
> >> -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE
> >> -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
> >>     optimize='-O2',
> >>     cppflags='-fno-strict-aliasing -pipe -Wdeclaration-after-statement
> >> -I/usr/local/include -I/usr/include/gdbm'
> >>     ccversion='', gccversion='3.4.6 20060404 (Red Hat 3.4.6-8)',
> >>     gccosandvers=''
> >>     intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
> >>     d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
> >>     ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
> >>     lseeksize=8
> >>     alignbytes=8, prototype=define
> >>   Linker and Libraries:
> >>     ld='cc', ldflags =' -L/usr/local/lib'
> >>     libpth=/usr/local/lib /lib /usr/lib
> >>     libs=-lnsl -ldl -lm -lcrypt -lutil -lc
> >>     perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
> >>     libc=/lib/libc-2.3.4.so, so=so, useshrplib=false, libperl=libperl.a
> >>     gnulibc_version='2.3.4'
> >>   Dynamic Linking:
> >>     dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
> >>     cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'
> >>
> >> -----Original Message-----
> >> From: Chris Marshall [mailto:devel.chm...@gmail.com]
> >> Sent: Friday, November 21, 2014 9:35 AM
> >> To: LYONS, KENNETH B (KENNETH)
> >> Cc: Derek Lamb; perldl
> >> Subject: Re: [Perldl] matching vectors inside a PDL
> >>
> >> Hi Ken-
> >>
> >> I am unable to generate the error with PDL-2.007
> >> either.  My system has 8GiB of memory and the
> >> PDL build is using the 64bit index support.
> >>
> >> What are the specs of your linux box and could
> >> you please send the output of the 'perldl -V'
> >> command.  If you built PDL from sources, then
> >> the build log should indicate whether 64bit index
> >> support was enabled.  If your $Config{ivsize} is
> >> 8 then you should have 64bit support as well.
> >>
> >> --Chris
> >>
> >>
> >> On Fri, Nov 21, 2014 at 8:02 AM, Chris Marshall <devel.chm...@gmail.com>
> >> wrote:
> >>> Ken-
> >>>
> >>> If you could make a short script that generates the
> >>> problem along with the output/error messages, that
> >>> would help.
> >>>
> >>> Do you have $PDL::BIGPDL set?  Might try with
> >>> that set to 1.
> >>>
> >>> I'll try the problem code on PDL-2.007 to see if that
> >>> is the reason for the differences.
> >>>
> >>> --Chris
> >>>
> >>>
> >>>
> >>> On Thu, Nov 20, 2014 at 6:18 PM, LYONS, KENNETH B (KENNETH)
> >>> <k...@research.att.com> wrote:
> >>>> Chris
> >>>> I'm running perl 5.8.8 on a rather old linux system.  I installed the
> >>>> perl modules rather recently from the PDL site, so I'd expect they are up
> >>>> to date with whatever is there.  From the names of the files, I'd say it's
> >>>> 2.007.
> >>>>
> >>>> I've tried a variety of ways of using the inplace method, and none of
> >>>> them produced a perl error akin to what you got below.  The errors were
> >>>> coming out of the PDL module itself, complaining about the size of the
> >>>> piddle being over 1GB.  Given the dimensions of the piddle that is being
> >>>> calculated (around 200 MB), that shouldn't have happened--unless it's using
> >>>> doubles, which would make it ~1.6 GB.  Like I said, I got around the
> >>>> problem in kind of a hack, by just slicing things up 20K rows at a
> >>>> time--but I'd really like to find a way to do it right!
> >>>>
> >>>> ...snip...
>
> _______________________________________________
> Perldl mailing list
> Perldl@jach.hawaii.edu
> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>



-- 
 "Debugging is twice as hard as writing the code in the first place.
  Therefore, if you write the code as cleverly as possible, you are,
  by definition, not smart enough to debug it." -- Brian Kernighan