subject:"Data structure question"

RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread John Hughes


 And indeed, they ought to die. Or be reimplemented. Or something, 
 but quite simply, don't use them. They'll break, they won't dwim,
 and chances are they won't play nice with future/past versions of
 Perl. Forget they even exist.

Details?

I'm using them in 5.005_03 (the real "last stable" version) with no
problems.

exists doesn't do what you think; that's the list of problems.

-- 
John Hughes [EMAIL PROTECTED], 
CalvaEDI SA.Tel: +33-1-4313-3131
66 rue du Moulin de la Pointe,  Fax: +33-1-4313-3139
75013 PARIS.

RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread John Hughes


  I had already reached the same conclusion after I saw that
 everyone would have to remember to say "my Dog $spot;" every time or the
 whole thing falls apart.

Falls apart?  How?

 If you want something reasonably close, you could do what a lot of the
 Template Toolkit code does and use arrays with constants for key
 names.  Here's an example:

Yes but then you get neither compile time (my Dog $spot) nor run time
(my $spot) error checking.

How are you going to debug the times you use a constant defined for
one structure to index another?

Have fun.

Oh, do it all through accessor functions.  That'll be nice and
fast won't it.

-- 
John Hughes [EMAIL PROTECTED], 
CalvaEDI SA.Tel: +33-1-4313-3131
66 rue du Moulin de la Pointe,  Fax: +33-1-4313-3139
75013 PARIS.

RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread Matt Sergeant


On Tue, 23 Jan 2001, John Hughes wrote:

  And indeed, they ought to die. Or be reimplemented. Or something, 
  but quite simply, don't use them. They'll break, they won't dwim,
  and chances are they won't play nice with future/past versions of
  Perl. Forget they even exist.
 
 Details?
 
 I'm using them with no problems in 5.005_03 (the real "last stable"
 version) with no problems.
 
 exists doesn't do what you think, that's the list of problems.

Neither does delete. And overloading doesn't really work properly. And
reloading modules with phashes doesn't work right. And sub-hashes don't
work right ($pseudo->{Hash}{SubHash}). And so on...

All they do is hide a multitude of sins, for very little real world
gain. Try it - convert your app back to non-pseudo hashes and see what
performance you lose. I'm willing to bet it's not a lot.

The only gain might be in a large DOM tree where there may be thousands of
objects. But then you're really better off using an array based class
instead (as I found out).

-- 
Matt/

/||** Director and CTO **
   //||**  AxKit.com Ltd   **  ** XML Application Serving **
  // ||** http://axkit.org **  ** XSLT, XPathScript, XSP  **
 // \\| // ** Personal Web Site: http://sergeant.org/ **
 \\//
 //\\
//  \\

RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread Matt Sergeant


On Tue, 23 Jan 2001, John Hughes wrote:

   I had already reached the same conclusion after I saw that
  everyone would have to remember to say "my Dog $spot;" every time or the
  whole thing falls apart.
 
 Falls apart?  How?

Because you miss one out and it's a very difficult-to-find bug in your
application, mostly because you don't get the compile warnings if you miss
one off, but also you end up wasting time looking for why your application
really isn't any faster (the hint here is that pseudo hashes really don't
make that much speed difference to your application).

Say you miss off a type declaration, and later decide to change your hash
key. All of the declarations with types will produce compile errors, so
you can/will fix them, but the one you missed it from will lie hidden,
never producing an error even when the code is called.
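A minimal sketch of the typed versus untyped declaration, for readers who want to try it. It assumes a Perl of 5.10 or later, where the fields pragma is backed by restricted hashes rather than the pseudo-hashes discussed in this thread, so the messages differ from what 5.005_03 or 5.6 would print. The Dog class and its Name and Breed fields are made up for illustration.

```perl
use strict;
use warnings;

package Dog;
use fields qw(Name Breed);    # declares the permitted keys at compile time

sub new {
    my Dog $self = shift;
    $self = fields::new($self) unless ref $self;
    $self->{Name} = shift;
    return $self;
}

package main;

# Typed declaration: the misspelled key is rejected while the code is
# being compiled. A string eval is used so the failure can be captured.
my $typed_ok = eval q{
    my Dog $spot = Dog->new("spot");
    $spot->{Nmae} = "rex";
    1;
};
my $typed_error = $@;

# Untyped declaration: the same typo compiles, and only fails if this
# statement is actually executed.
my $untyped_ok = eval q{
    my $spot = Dog->new("spot");
    $spot->{Nmae} = "rex";
    1;
};
my $untyped_error = $@;

print "typed:   $typed_error";
print "untyped: $untyped_error";
```

In this sketch the typed variable fails at compile time and the untyped one fails only when the offending line runs, so a missed declaration stays hidden on any code path the tests never reach.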

  If you want something reasonably close, you could do what a lot of the
  Template Toolkit code does and use arrays with constants for key
  names.  Here's an example:
 
 Yes but then you get neither compile time (my Dog $spot) nor run time
 (my $spot) error checking.

Why not?

Witness:

% perl -Mstrict
use constant FOO => 0;
my @array;
$array[FOD] = 3;
Bareword "FOD" not allowed while "strict subs" in use at - line 3.

Seems like compile time checking to me...
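The same check can be reproduced in a script. This is only a sketch: the misspelled constant is compiled inside a string eval so that the compile-time failure can be captured rather than aborting the whole program, and the variable names are illustrative.

```perl
use strict;
use warnings;
use constant FOO => 0;

my @array;
$array[FOO] = 3;    # fine: FOO is a known constant

# The typo (FOD) is compiled separately so the error ends up in $@.
my $compiled = eval q{ my @other; $other[FOD] = 3; 1 };
my $compile_error = $@;

print "array: @array\n";
print "error: $compile_error";
```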

 How are you going to debug the times you use a constant defined for
 one structure to index another?

You use packages, and data hiding.

 Oh, do it all through accessor functions.  That'll be nice and
 fast won't it.

Maybe faster than you think. Your bottleneck is elsewhere.

If you are really going: 

my Dog $spot = Dog->new("spot");
print "My Dog's name is: ", $spot->{Name}, "\n";

Then I think many people here would think that is a very bad
technique. You should *never* be able to make assumptions about the
underlying data format of an object.

-- 
Matt/


RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread John Hughes


(exists doesn't work).

 Neither does delete.

Ok.  But what should it do?  What does it do for an array?

 And overloading doesn't really work properly.

Details?

 And reloading modules with phashes doesn't work right.

I steer clear of reloading, almost anything screws up.

 And sub-hashes doesn't work right ($pseudo->{Hash}{SubHash}).

Details?  Works for me.

 And so on...

 All they do is hide a multitude of sins, for very little real world
 gain. Try it - convert your app back to non-pseudo hashes and see what
 performance you lose. I'm willing to bet its not a lot.

Well, obviously.  Hashes aren't slow.  But they are *BIG*.

-- 
John Hughes [EMAIL PROTECTED], 
CalvaEDI SA.Tel: +33-1-4313-3131
66 rue du Moulin de la Pointe,  Fax: +33-1-4313-3139
75013 PARIS.

RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread Matt Sergeant


On Tue, 23 Jan 2001, John Hughes wrote:

 (exists doesn't work).
 
  Neither does delete.
 
 Ok.  But what should it do?  What does it do for an array?

But we're talking about hashes! At the very least it should make it so
that exists() returns false.

  And overloading doesn't really work properly.
 
 Details?

Overloading was the wrong word, FWIW... What I meant was, it doesn't work
right if you subclass a module using @ISA = (...) rather than use base. So
everybody has to *know* the underlying implementation of your class
anyway, so that breaks the very concept of OO/Data Hiding.

  And reloading modules with phashes doesn't work right.
 
 I steer clear of reloading, almost anything screws up.

That's an overstatement in the extreme. Reloading works fine for a great
many people, and most modules.

  And sub-hashes doesn't work right ($pseudo->{Hash}{SubHash}).
 
 Details?  Works for me.

SubHash isn't compile time checked! You need to do:

my SubH $subhash = $pseudo->{Hash};
$subhash->{SubHash};

to get the compile time checking.

  All they do is hide a multitude of sins, for very little real world
  gain. Try it - convert your app back to non-pseudo hashes and see what
  performance you lose. I'm willing to bet its not a lot.
 
 Well, obviously.  Hashes aren't slow.  But they are *BIG*.

??? How many keys are in your pseudo hashes? I'm willing to bet not that
many. The difference to your particular application is probably less than
you think. That is, unless it's a huge set of objects (thousands).

-- 
Matt/


RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread Robin Berjon


At 11:36 23/01/2001 +0100, John Hughes wrote:
  Neither does delete.

 Ok.  But what should it do?  What does it do for an array?

perldoc -f delete

"In the case of an array, if the array elements happen to be at the end,
the size of the array will shrink to the highest element that tests true
for exists() (or 0 if no such element exists)."

Pretty much what one would expect.
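A short sketch of the behaviour that passage describes, assuming Perl 5.6 or later. Current perldoc strongly discourages calling delete or exists on array elements, so treat this as an illustration of the semantics rather than a recommendation.

```perl
use strict;
use warnings;

my @list = (10, 20, 30, 40);

delete $list[1];                                # an element in the middle
my $len_after_middle = @list;                   # length is unchanged
my $middle_exists    = exists $list[1] ? 1 : 0; # but the slot no longer "exists"

delete $list[3];                                # the last element
my $len_after_last = @list;                     # the array shrinks

print "after middle delete: $len_after_middle, exists: $middle_exists\n";
print "after last delete:   $len_after_last\n";
```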

  All they do is hide a multitude of sins, for very little real world
  gain. Try it - convert your app back to non-pseudo hashes and see what
  performance you lose. I'm willing to bet its not a lot.

 Well, obviously.  Hashes aren't slow.  But they are *BIG*.

That's why arrays are so cool. And there are many tricks to make them work
pretty much the way you'd expect a hash to work, with very few limitations.
I also have a mind to try and play with use overload '%{}' on an array
based object to see if interesting stuff could be done there. It'll be
slower of course, but it could perhaps beat a tied hash (ties are awfully
slow).
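One way the overload idea could look, as a rough sketch only. The Dog class, its field names and the as_hash method are invented here, and the hash view is rebuilt on every access, so it is read-only in practice: writes go to a temporary hash and are lost.

```perl
use strict;
use warnings;

package Dog;
use constant { NAME => 0, ID => 1 };

# Route hash dereferences on the object through as_hash().
use overload '%{}' => \&as_hash, fallback => 1;

sub new {
    my ($class, $name, $id) = @_;
    return bless [ $name, $id ], $class;
}

# Builds a throwaway hash each time the object is used as a hash ref.
sub as_hash {
    my $self = shift;
    return { name => $self->[NAME], id => $self->[ID] };
}

package main;

my $spot = Dog->new('spot', 7);
my $name = $spot->{name};    # goes through as_hash()
my $id   = $spot->[1];       # plain array access still works
print "$name has id $id\n";
```

Whether this beats a tied hash would need measuring; the sketch only shows that the syntax can be made to work on an array-based object.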

-- robin b.
We are born naked, wet and hungry . Then things get worse.

Re: pseudo-hashes? was: Data structure question

2001-01-23 Thread DeWitt Clinton


On Tue, Jan 23, 2001 at 10:06:13AM +, Matt Sergeant wrote:

 The only gain might be in a large DOM tree where there may be
 thousands of objects. But then you're really better off using an
 array based class instead (as I found out).

This is getting a bit off-topic, but I've empirically found that the
DOM is not necessarily the best object model to use in a mod_perl
environment.  XML::DOM in particular has such a high overhead in terms
of memory (and memory leaks) and performance, that it is sometimes
inappropriate for a context that requires a small footprint, and
generally fast throughput (like mod_perl).

For example, in version 1 of the Avacet perl libraries, we were using
XML::DOM for both our XML-RPC mechanism and as the underlying data
structure for object manipulation.  In version 2, however, we created
an architecture that automatically converts between the language
agnostic XML and native blessed objects using a custom engine built on
the PerlSAX parser.  This reduced our memory footprint dramatically,
stopped up the memory leaks, and increased performance significantly.
Moreover, the object model now exposed is based on native perl objects
with an API geared toward property manipulation (i.e., get_foo,
set_foo) which is easier to program directly to than the DOM.

You can see this in action with the modules available in the
Avacet::Core::Rpc::Xml namespace at www.avacet.com.  

Best regards,

-DeWitt

RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread Matt Sergeant


On Tue, 23 Jan 2001, Robin Berjon wrote:

 At 11:36 23/01/2001 +0100, John Hughes wrote:
  Neither does delete.
 
 Ok.  But what should it do?  What does it do for an array?

 perldoc -f delete

 "In the case of an array, if the array elements happen to be at the end,
 the size of the array will shrink to the highest element that tests true
 for exists() (or 0 if no such element exists)."

 Pretty much what one would expect.

That's only 5.6+ though. So it's only useful for internal applications (if
at all).

-- 
Matt/


RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread Robin Berjon


At 12:50 23/01/2001 +, Matt Sergeant wrote:
 That's only 5.6+ though. So it's only useful for internal applications (if
 at all).

True, but we've been using 5.6 (built from AS source) in production for
quite a while now very happily. Also, I'm seeing more and more customers
having it or ready to upgrade. Doesn't make delete @array that much more
useful, but there's hope.

-- robin b.
Designing pages in HTML is like having sex in a bathtub. If you don't know
anything about sex, it won't do you any good to know a lot about bathtubs.

RE: pseudo-hashes? was: Data structure question

2001-01-23 Thread Perrin Harkins


On Tue, 23 Jan 2001, John Hughes wrote:
   I had already reached the same conclusion after I saw that
  everyone would have to remember to say "my Dog $spot;" every time or the
  whole thing falls apart.
 
 Falls apart?  How?

If you forget the "Dog" part somewhere, it's slower than a normal hash.

  If you want something reasonably close, you could do what a lot of the
  Template Toolkit code does and use arrays with constants for key
  names.  Here's an example:
 
 Yes but then you get neither compile time (my Dog $spot) nor run time
 (my $spot) error checking.

As Matt pointed out, you get compile time errors if you use an undefined
constant as a key.

You can also do this sort of thing with hashes, like this:

use strict;
my %foo;
my $bar = 'bar';
$foo{$bar};

If you type $foo{$barf} instead, you'll get an error.

 How are you going to debug the times you use a constant defined for
 one structure to index another?

Different classes would be in different packages.
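A small sketch of what keeping the constants in separate packages buys you. Dog and Cat are hypothetical classes; each defines its own NAME at a different index, and code outside those packages cannot use the bare constant at all.

```perl
use strict;
use warnings;

package Dog;
use constant NAME => 0;
sub new  { my ($class, $name) = @_; return bless [$name], $class }
sub name { return $_[0][NAME] }

package Cat;
use constant LIVES => 0;
use constant NAME  => 1;
sub new  { my ($class, $name) = @_; return bless [9, $name], $class }
sub name { return $_[0][NAME] }

package main;

my $dog_name = Dog->new('spot')->name;
my $cat_name = Cat->new('tom')->name;

# main has no NAME constant, so indexing with it does not even compile.
my $leaked = eval q{ my @x = ('a', 'b'); my $v = $x[NAME]; 1 };
my $leak_error = $@;

print "$dog_name, $cat_name\n";
print "outside the class: $leak_error";
```

This does not stop someone from writing Cat::NAME against a Dog on purpose, but it does make the accidental mix-up much harder.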

 Oh, do it all through accessor functions.  That'll be nice and
 fast won't it.

Well, I thought we were talking about data structures to use for objects.

A few months back, when making design decisions for a big project, I
benchmarked pseudo-hashes on 5.00503.  They weren't significantly faster
than hashes, and only 15% smaller.  I figured they were only worth the
trouble if we were going to be making thousands of small objects, which is
a bad idea in the first place.  So, we opted for programmer efficiency and
code readability and wrote hashes when we meant hashes.  Of course, since
this stuff is OO code, we could always go back and change the internal
implementation to pseudo-hashes if it looked like it would help.
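For anyone wanting to repeat this kind of measurement, here is a sketch using the core Benchmark module. Pseudo-hashes were removed in Perl 5.10, so it compares a plain hash with a constant-indexed array instead, which is an assumption on my part about what a modern rerun would look like. It measures access speed only, not memory, and no result is claimed here.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

use constant { NAME => 0, ID => 1 };

my $as_hash  = { name => 'spot', id => 7 };
my $as_array = [ 'spot', 7 ];

# A negative count means "run each case for at least that many CPU seconds".
cmpthese(-1, {
    hash  => sub { my $n = $as_hash->{name};  my $i = $as_hash->{id} },
    array => sub { my $n = $as_array->[NAME]; my $i = $as_array->[ID] },
});
```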

If pseudo-hashes work for you, go ahead and use them.  If it ain't
broke...

- Perrin

pseudo-hashes? was: Data structure question

2001-01-22 Thread Tom_Roche


Until reading Conway's "Object Oriented Perl"

http://www.manning.com/Conway/

(section 4.3, pp 126-135) I hadn't heard about pseudo-hashes. I now
desire a data structure with non-numeric keys, definable iteration
order, no autovivification, and happy syntax. (And, of course,
fast-n-small :-) Having Conway's blessing is nice, and perldelta for
5.6 says "Pseudo-hashes work better" (with details). But it also says

http://perldoc.com/perl5.6/pod/perldelta.html
 NOTE: The pseudo-hash data type continues to be experimental.
   Limiting oneself to the interface elements provided by the
   fields pragma will provide protection from any future changes

In addition to such faint praise, I'm also seeing damnations, such as
the Perl6 RFC "Pseudo-hashes must die!" and

Matt Sergeant [EMAIL PROTECTED] Thu, 8 Jun 2000 15:44:04 +0100 (BST)
 Pseudo hash references are badly broken even in 5.6. Anyone who's
 done extensive work with them (or tried to) can tell you that.

Which deters. (As does

 Instead, write a class for your objects, and use arrays internally.
 Define constants for the indexes of the arrays.

which appears laziness-deficient :-)

I'm also _not_ seeing messages of the form, "Yes, we used phashes to
implement our telepathic subsystem, which services 4.2 zillion users
every day. We love them."

Being an empiricist (and a wimp :-), I'd like to know:

* Is anyone out there using pseudo-hashes in production code under
  mod_perl?

* Is anyone now using (under mod_perl) something they consider to be
  superior but with similar functionality and interface?

If possible reply directly to me as well as the list (I'm digesting),
and TIA, [EMAIL PROTECTED]

Re: pseudo-hashes? was: Data structure question

2001-01-22 Thread Matt Sergeant


On Mon, 22 Jan 2001, [EMAIL PROTECTED] wrote:

Well you've already seen I'm a detractor :-)

 * Is anyone now using (under mod_perl) something they consider to be
   superior but with similar functionality and interface?

Yes, a class which is a blessed array.

-- 
Matt/


Re: pseudo-hashes? was: Data structure question

2001-01-22 Thread Robin Berjon


At 18:05 22/01/2001 -0500, [EMAIL PROTECTED] wrote:
 the Perl6 RFC "Pseudo-hashes must die!" and

And indeed, they ought to die. Or be reimplemented. Or something, but quite
simply, don't use them. They'll break, they won't dwim, and chances are
they won't play nice with future/past versions of Perl. Forget they even exist.

As Matt says, array based objects are much much better, and do what you
want them to do. You seem to be deterred by the laziness factor. Not so
much of a problem ! You could use enum, but it has constraints on the names
you can use which I don't like. You also probably don't need all that it does.

Following is a small class that I've been using in one of my projects. You
can use it in two ways:

If you are not extending a class that uses an array based object, simply
define the fields and use them:

package MyClass;
BEGIN { use Tessera::Util::Enum qw(FOO BAR BAZ); }

sub new {
  my $class = shift;
  return bless [], $class;
}

sub foo {
  my $self = shift;
  return $self->[FOO]; # fetch what's at index FOO
}

Sounds simple enough right ? The problem with array based objects is that
generally they can't be extended. That is, if you have a subclass of an
array based class it's a pain to add new fields because you never know if
your base class might add new fields, and thus break your index. That's one
reason why hashes are still used so much. With my class, you can do (in
your subclass):

package MyClass::Subclass;
use base 'MyClass';
use Tessera::Util::Enum;

BEGIN {
  Tessera::Util::Enum->extend(
    class => 'MyClass',
    with  => [qw(
      NEW_FIELD
      OTHER_NEW
    )],
  );
}

sub get_new_field {
  my $self = shift;
  return $self->[NEW_FIELD];
}

and it will just work. One thing you can't have is multiple inheritance
(well, you can choose to extend just one of the parent classes). I've been
using this quite extensively in a system of mine, and I've been quite happy
with it. It does reduce memory usage in a DOM2 implementation of mine. Of
course, you can change the class name as it won't mean anything outside my
framework, tweak it, throw it out the window, etc...

Perhaps I should put it on CPAN if there's interest in such things (and no
such module is already there).

###
# Tessera Enum Class
# Robin Berjon [EMAIL PROTECTED]
# 03/11/2000 - prototype mark V
###

package Tessera::Util::Enum;
use strict;
no strict 'refs';
use vars qw($VERSION %packages);
$VERSION = '0.01';

#-#
# import()
#-#
sub import {
    my $class = shift;
    @_ or return;
    my $pkg = caller();

    my $idx = 0;
    for my $enum (@_) {
        # install a constant sub named $enum in the calling package
        *{$pkg . '::' . $enum} = eval "sub () { $idx }";
        $idx++;
    }
    $packages{$pkg} = $idx; # this is the idx of the next field
}
#-#


#-#
# extend(class => 'class', with => \@ra_fieldnames)
#-#
sub extend {
    my $class   = shift;
    my %options = @_;
    my $pkg     = caller();

    # warn if the base class never registered its fields with import()
    warn "extending a class ($options{class}) that hasn't yet been defined"
        unless exists $packages{$options{class}};

    # start numbering where the base class stopped (0 if it is unknown)
    my $idx = $packages{$options{class}} || 0;
    for my $enum (@{$options{with}}) {
        *{$pkg . '::' . $enum} = eval "sub () { $idx }";
        $idx++;
    }
    $packages{$pkg} = $idx; # this is the idx of the next field
}
#-#



1;
=pod

=head1 NAME

Tessera::Util::Enum - very simple enums

=head1 SYNOPSIS

  use Tessera::Util::Enum qw(
      _foo_
      BAR
      baz_gum
  );

  or

  use Tessera::Util::Enum ();
  Tessera::Util::Enum->extend(
      class => 'Some::Class',
      with  => [qw(
          more_foo_
          OTHER_BAR
      )],
  );

=head1 DESCRIPTION

This class only exists because enum.pm has restrictions on naming
that I don't like. I also don't need its entire power.

It also adds the possibility to extend a class that already uses
Enum to define its fields. We will start at that index.

=head1 AUTHOR

Robin Berjon [EMAIL PROTECTED]

This module is licensed under the same terms as Perl itself.

=cut

-- robin b.
As a computer, I find your faith in technology amusing.

Re: pseudo-hashes? was: Data structure question

2001-01-22 Thread Perrin Harkins


On Mon, 22 Jan 2001 [EMAIL PROTECTED] wrote:
 (section 4.3, pp 126-135) I hadn't heard about pseudo-hashes. I now
 desire a data structure with non-numeric keys, definable iteration
 order, no autovivification, and happy syntax. (And, of course,
 fast-n-small :-) Having Conway's blessing is nice

Pseudo-hashes do not have Conway's blessing.  We hired him to do a
tutorial for our engineers a few months back, and he railed about how
disappointing pseudo-hashes turned out to be and why no one should ever
use them.  I had already reached the same conclusion after I saw that
everyone would have to remember to say "my Dog $spot;" every time or the
whole thing falls apart.

If you want something reasonably close, you could do what a lot of the
Template Toolkit code does and use arrays with constants for key
names.  Here's an example:

package Dog;

use constant NAME => 1;
use constant ID   => 2;

sub new {
  my $self = [];
  $self->[ NAME ] = 'spot';
  $self->[ ID ]   = 7;
  return bless $self;
}

Or something like that, and make accessors for the member data.  I think
there are CPAN modules which can automate this for you if you wish.
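A slightly fuller sketch of the same idea with accessors added. The constructor arguments and the combined getter/setter style are my assumptions, not something taken from Template Toolkit, and the indexes here start at 0 rather than 1.

```perl
use strict;
use warnings;

package Dog;

use constant NAME => 0;
use constant ID   => 1;

sub new {
    my ($class, %args) = @_;
    my $self = bless [], $class;
    $self->[NAME] = $args{name};
    $self->[ID]   = $args{id};
    return $self;
}

# Combined getter/setter: callers never touch the array directly.
sub name {
    my $self = shift;
    $self->[NAME] = shift if @_;
    return $self->[NAME];
}

sub id {
    my $self = shift;
    $self->[ID] = shift if @_;
    return $self->[ID];
}

package main;

my $spot = Dog->new(name => 'spot', id => 7);
$spot->name('rex');
print $spot->name, ' has id ', $spot->id, "\n";
```

Because callers only see name() and id(), the storage could later be switched to a hash without touching any calling code.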

- Perrin

Re: pseudo-hashes? was: Data structure question

2001-01-22 Thread Ken Williams


[EMAIL PROTECTED] (Perrin Harkins) wrote:
 On Mon, 22 Jan 2001 [EMAIL PROTECTED] wrote:
  (section 4.3, pp 126-135) I hadn't heard about pseudo-hashes. I now
  desire a data structure with non-numeric keys, definable iteration
  order, no autovivification, and happy syntax. (And, of course,
  fast-n-small :-) Having Conway's blessing is nice

 Pseudo-hashes do not have Conway's blessing.  We hired him to do a
 tutorial for our engineers a few months back, and he railed about how
 disappointing pseudo-hashes turned out to be and why no one should ever
 use them.  I had already reached the same conclusion after I saw that
 everyone would have to remember to say "my Dog $spot;" every time or the
 whole thing falls apart.

At the last YAPC he talked about the various unsatisfactory approaches
and finally seemed to advocate for his Tie::SecureHash module.  Among
other things, it allows '__private', '_protected', and 'public' data
members.  I'm not sure whether it supports explicit declarations of key
names, but I bet it could be added easily if not.

I haven't used the module, but wanted to pass along the info.


  ------
  Ken Williams Last Bastion of Euclidity
  [EMAIL PROTECTED]The Math Forum

Re: Data structure question

2000-06-08 Thread Stephen Zander


 "Drew" == Drew Taylor [EMAIL PROTECTED] writes:
Drew I would like to return a single data structure, but order IS
Drew important (hence the current setup). I was thinking of using
Drew an array, where each element is a hash reference. So I would
Drew return something like this:

In this case pseudohashes are absolutely what you're looking for.
They'll also have the smallest impact on your code as you can walk
@{$ref}[1..foo] when you need the items in order and grab $ref->{key}
when you need a particular value.  Just remember that $ref->[0] is
special.

-- 
Stephen

"Farcical aquatic ceremonies are no basis for a system of government!"

Re: Data structure question

2000-06-08 Thread Matt Sergeant


On 8 Jun 2000, Stephen Zander wrote:

  "Drew" == Drew Taylor [EMAIL PROTECTED] writes:
 Drew I would like to return a single data structure, but order IS
 Drew important (hence the current setup). I was thinking of using
 Drew an array, where each element is a hash reference. So I would
 Drew return something like this:
 
 In this case pseudohashes are absolutely what you're looking for.
 They'll also have the smallest impact on your code as you can walk
 @{$ref}[1..foo] when you need the items in order and grab $ref->{key}
 when you need a particular value.  Just remember that $ref->[0] is
 special.

Ugh. Pseudo hash references are badly broken even in 5.6. Anyone who's
done extensive work with them (or tried to) can tell you that.

Instead, write a class for your objects, and use arrays internally. Define
constants for the indexes of the arrays.

-- 
Matt/

Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org

Data structure question

2000-06-06 Thread Drew Taylor


Hello,

This doesn't directly relate to mod_perl, but I'd like to make this as
memory efficient as possible since it runs under mod_perl. :-) 

I have a question about data structures. Currently, I am doing SQL
queries and returning an array ref and a hash ref. The array is to
preserve order, and the hash contains various bits of data about that
particular product ( it could be another hash ref, and often is). After
getting the two references, I use foreach to loop through the array, and
within that loop I access various data from the hash where the productID
is the key. It looks like this:

Common.pm:
sub getdata {
   my $AR = [123, 456, 243, ... ];
   my $HR = { 123 => {foo => 'bar', name => 'name', price => '123'}, ... };
   return ($AR, $HR);
}

Otherstuff.pm:
my ($AR, $HR) = $self->getdata();
foreach (@{$AR}) {
   my $name = $HR->{$_}{name};
   ...
}

I would like to return a single data structure, but order IS important
(hence the current setup). I was thinking of using an array, where each
element is a hash reference. So I would return something like this:

[ {ID => 123, name => 'name123', foor => 'bar'},  {ID => 321, name => 'name321',
bar => 'foo'}, ... ]

Are there any dereferencing issues (performance wise) that would make
this less efficient than the 2 structures? TIA for any pointers.
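The single-structure version might look like the sketch below. The rows are hard-coded stand-ins for whatever the SQL query returns, and the optional ID index is an addition for cases where lookup by ID is still needed.

```perl
use strict;
use warnings;

# Hypothetical rows, in the order an ORDER BY query would deliver them.
sub getdata {
    return [
        { ID => 123, name => 'name123', foo => 'bar' },
        { ID => 321, name => 'name321', bar => 'foo' },
    ];
}

my $rows = getdata();

# Iteration order is simply array order.
my @names = map { $_->{name} } @$rows;

# Optional: build an ID index once if random access is also needed.
my %by_id = map { ( $_->{ID} => $_ ) } @$rows;

print join(', ', @names), "\n";
print $by_id{321}{name}, "\n";
```

Each row costs one array dereference and one hash lookup, the same number of steps as the two-structure version, and the index shares the row hashes rather than copying them.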

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Data structure question

2000-06-06 Thread Ken Y. Clark


On Tue, 6 Jun 2000, Drew Taylor wrote:

 I have a question about data structures. Currently, I am doing SQL
 queries and returning an array ref and a hash ref. The array is to
 preserve order, and the hash contains various bits of data about that

not to be dense, but can't you just issue an "order by" clause as part of
your SQL statement?  

ky

Re: Data structure question

2000-06-06 Thread Gunther Birznieks


Using tied hashes, you could conceivably make your own ordered hash class 
and use that as the data structure you return. You'd still basically have 
two data structures (for performance) but the fact that it is two data 
structures would be hidden behind the tied hash which would be programmed 
to iterate the keys using the array rather than the keys function on the 
hash part.

I think there is source code for this publicly available, but I forget 
where I saw it. You can get some docs from perldoc perltie though.
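A bare-bones sketch of such a tied class, following the method names documented in perltie. OrderedHash is a made-up name and this is not the publicly available code mentioned above. Deletion is linear in the number of keys, which is fine for a sketch but not for large sets.

```perl
use strict;
use warnings;

package OrderedHash;    # hypothetical name, for illustration only

sub TIEHASH {
    my $class = shift;
    return bless { data => {}, order => [], pos => 0 }, $class;
}

sub STORE {
    my ($self, $key, $value) = @_;
    # remember insertion order the first time a key is seen
    push @{ $self->{order} }, $key unless exists $self->{data}{$key};
    $self->{data}{$key} = $value;
}

sub FETCH  { my ($self, $key) = @_; return $self->{data}{$key} }
sub EXISTS { my ($self, $key) = @_; return exists $self->{data}{$key} }

sub DELETE {
    my ($self, $key) = @_;
    @{ $self->{order} } = grep { $_ ne $key } @{ $self->{order} };
    return delete $self->{data}{$key};
}

sub CLEAR {
    my $self = shift;
    %{ $self->{data} }  = ();
    @{ $self->{order} } = ();
}

# keys() and each() walk the array, not the underlying hash.
sub FIRSTKEY { my $self = shift; $self->{pos} = 0; return $self->NEXTKEY }
sub NEXTKEY  { my $self = shift; return $self->{order}[ $self->{pos}++ ] }

package main;

tie my %products, 'OrderedHash';
$products{456} = { name => 'second' };
$products{123} = { name => 'first' };
$products{243} = { name => 'third' };

my @order = keys %products;
print "@order\n";
```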


Re: Data structure question

2000-06-06 Thread Eric Cholet


 Using tied hashes, you could conceivably make your own ordered hash class
 and use that as the data structure you return. You'd still basically have
 two data structures (for performance) but the fact that it is two data
 structures would be hidden behind the tied hash which would be programmed
 to iterate the keys using the array rather than the keys function on the
 hash part.

 I think there is source code for this publicly available, but I forget
 where I saw it. You can get some docs from perldoc perltie though.


Tie::IxHash

--
Eric

Re: Data structure question

2000-06-06 Thread Ken Miller


At 12:39 PM 6/6/00 -0400, Drew Taylor wrote:
 Hello,

 This doesn't directly relate to mod_perl, but I'd like to make this as
 memory efficient as possible since it runs under mod_perl. :-)

 I have a question about data structures. Currently, I am doing SQL
 queries and returning an array ref and a hash ref. The array is to
 preserve order, and the hash contains various bits of data about that
 particular product ( it could be another hash ref, and often is). After
 getting the two references, I use foreach to loop through the array, and
 within that loop I access various data from the hash where the productID
 is the key. It looks like this:

 Common.pm:
 sub getdata {
    my $AR = [123, 456, 243, ... ];
    my $HR = { 123 => {foo => 'bar', name => 'name', price => '123'}, ... };
    return ($AR, $HR);
 }

 Otherstuff.pm:
 my ($AR, $HR) = $self->getdata();
 foreach (@{$AR}) {
    my $name = $HR->{$_}{name};
    ...
 }

 I would like to return a single data structure, but order IS important
 (hence the current setup). I was thinking of using an array, where each
 element is a hash reference. So I would return something like this:

 [ {ID => 123, name => 'name123', foor => 'bar'},  {ID => 321, name => 'name321',
 bar => 'foo'}, ... ]

Well, if the keys are unique, you could just return a hashref, and then
access it using sorted keys:

foreach( sort keys %$HR ) {
## insert useful stuff here
}
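One thing to watch with this approach: a bare sort compares keys as strings, so numeric IDs need an explicit comparison. A small sketch with made-up IDs:

```perl
use strict;
use warnings;

my $HR = {
    1000 => { name => 'd' },
    243  => { name => 'c' },
    9    => { name => 'a' },
};

my @string_order  = sort keys %$HR;                  # compares as strings
my @numeric_order = sort { $a <=> $b } keys %$HR;    # compares as numbers

print "string:  @string_order\n";
print "numeric: @numeric_order\n";
```

Either way the result is key order, which only matches the query's order if the query was sorted on that same key.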

 Are there any de-referenceing issues (performance wise) that would make
 this less efficient than the 2 structures? TIA for any pointers.

Probably not, except your method takes more mems, since you're returning an
extra array.  'Course, the sort takes mems as well, but not as much as the
extra array.

And, there is the overhead of sorting the keys.

I think an array of hashrefs is probably the best bet.  Then you can use
the DB's sort, and just build the array on the fly, once.

For the site I'm working on, I return a reference to a ResultSet object
which through the next() method returns the next row in the result set:

my $account = $dbs->get_account( "123456789" );
my $rs = $account->get_cards;
while( my $unit = $rs->next ) {
    # do something
}

so this enforces the order (due to the order by in the SQL query).  This is
a bit slower, since next() returns allocated objects, but it works.
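A stripped-down sketch of what such a result-set object could look like. The ResultSet class below is a hypothetical stand-in that wraps an in-memory list; the real one presumably fetches rows from a database handle.

```perl
use strict;
use warnings;

package ResultSet;    # hypothetical stand-in for the class described above

sub new {
    my ($class, @rows) = @_;
    return bless { rows => [@rows], pos => 0 }, $class;
}

# Returns the next row, or nothing once the rows are exhausted.
sub next {
    my $self = shift;
    return if $self->{pos} >= @{ $self->{rows} };
    return $self->{rows}[ $self->{pos}++ ];
}

package main;

my $rs = ResultSet->new(
    { card => '1111' },
    { card => '2222' },
);

my @seen;
while ( my $unit = $rs->next ) {
    push @seen, $unit->{card};
}
print "@seen\n";
```

The caller never sees how the rows are stored, so the order is whatever the object was given, and the storage can change later without breaking the loop.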


Cheers!

-klm.

---
Ken Miller, Consultant
Shetland Software Services Inc.

Re: Data structure question

2000-06-06 Thread Stas Bekman


On Tue, 6 Jun 2000, Eric Cholet wrote:

  Using tied hashes, you could conceivably make your own ordered hash class
  and use that as the data structure you return. You'd still basically have
  two data structures (for performance) but the fact that it is two data
  structures would be hidden behind the tied hash which would be programmed
  to iterate the keys using the array rather than the keys function on the
  hash part.
 
  I think there is source code for this publicly available, but I forget
  where I saw it. You can get some docs from perldoc perltie though.
 
 
 Tie::IxHash

and in perl5.6 it's called pseudohash (well it was known before but is
supported in 5.6) 
http://www.perl.com/pub/doc/manual/html/pod/perldelta.html#Pseudo_hashes_are_supported

also take a look at this:
Building a Better Hash
http://www.dfan.org/real/tpj_hash.html


_
Stas Bekman  JAm_pH --   Just Another mod_perl Hacker
http://stason.org/   mod_perl Guide  http://perl.apache.org/guide 
mailto:[EMAIL PROTECTED]   http://perl.org http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org

Re: Data structure question

2000-06-06 Thread Drew Taylor


"Ken Y. Clark" wrote:
 
 On Tue, 6 Jun 2000, Drew Taylor wrote:
 
  I have a question about data structures. Currently, I am doing SQL
  queries and returning an array ref and a hash ref. The array is to
  preserve order, and the hash contains various bits of data about that
 
 not to be dense, but can't you just issue an "order by" clause as part of
 your SQL statement?
I do that already. But hashes don't preserve order on their own,
otherwise I'd just use that alone. The array keeps things in the proper
order, and the hash stores data about each individual record.

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Data structure question

2000-06-06 Thread Drew Taylor


Gunther Birznieks wrote:
 
 Using tied hashes, you could conceivably make your own ordered hash class
 and use that as the data structure you return. You'd still basically have
 two data structures (for performance) but the fact that it is two data
 structures would be hidden behind the tied hash which would be programmed
 to iterate the keys using the array rather than the keys function on the
 hash part.
 
 I think there is source code for this publicly available, but I forget
 where I saw it. You can get some docs from perldoc perltie though.
I know about tied hashes - Thanks Damien for your excellent book! - but
there is a performance penalty. How big is this penalty? Is it worth
using tied hashes? Versus an array of hash refs?

 At 12:39 PM 6/6/00 -0400, Drew Taylor wrote:
 Hello,
 
 This doesn't directly relate to mod_perl, but I'd like to make this as
 memory efficient as possible since it runs under mod_perl. :-)
 
 I have a question about data structures. Currently, I am doing SQL
 queries and returning an array ref and a hash ref. The array is to
 preserve order, and the hash contains various bits of data about that
 particular product ( it could be another hash ref, and often is). After
 getting the two references, I use foreach to loop through the array, and
 within that loop I access various data from the hash where the productID
 is the key. It looks like this:
 
 Common.pm:
 sub getdata {
     my $AR = [123, 456, 243, ... ];
     my $HR = { 123 => {foo => 'bar', name => 'name', price => '123'}, ... };
     return ($AR, $HR);
 }
 
 Otherstuff.pm:
 my ($AR, $HR) = $self->getdata();
 foreach (@{$AR}) {
     my $name = $HR->{$_}{name};
     ...
 }
 
 I would like to return a single data structure, but order IS important
 (hence the current setup). I was thinking of using an array, where each
 element is a hash reference. So I would return something like this:
 
 [ {ID => 123, name => 'name123', foo => 'bar'},  {ID => 321, name => 'name321',
 bar => 'foo'}, ... ]
 
 Are there any de-referenceing issues (performance wise) that would make
 this less efficient than the 2 structures? TIA for any pointers.
 
-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Data structure question

2000-06-06 Thread Drew Taylor


Eric Cholet wrote:
 
  Using tied hashes, you could conceivably make your own ordered hash class
  and use that as the data structure you return. You'd still basically have
  two data structures (for performance) but the fact that it is two data
  structures would be hidden behind the tied hash which would be programmed
  to iterate the keys using the array rather than the keys function on the
  hash part.
 
  I think there is source code for this publicly available, but I forget
  where I saw it. You can get some docs from perldoc perltie though.
 
 Tie::IxHash
How much overhead does this module impose? I've heard about it in my
readings, but never looked into it very much. Does anyone have
experience using Tie::IxHash?
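
For a sense of where the overhead comes from, here is a minimal sketch of an
ordered tied hash. It is not Tie::IxHash's actual implementation; the
OrderedHash package name and the sample product IDs are made up for
illustration. Every read, write, and iteration step becomes a method call,
which is the cost being asked about.

```perl
use strict;
use warnings;

# Hypothetical minimal ordered hash: a hash for lookup plus an array for key order.
{
    package OrderedHash;

    sub TIEHASH {
        my $class = shift;
        return bless { data => {}, order => [], iter => 0 }, $class;
    }

    sub STORE {
        my ( $self, $key, $value ) = @_;
        # Remember the key's position only the first time it is stored.
        push @{ $self->{order} }, $key unless exists $self->{data}{$key};
        $self->{data}{$key} = $value;
    }

    sub FETCH  { $_[0]{data}{ $_[1] } }
    sub EXISTS { exists $_[0]{data}{ $_[1] } }

    sub DELETE {
        my ( $self, $key ) = @_;
        @{ $self->{order} } = grep { $_ ne $key } @{ $self->{order} };
        return delete $self->{data}{$key};
    }

    sub CLEAR {
        my $self = shift;
        %{ $self->{data} }  = ();
        @{ $self->{order} } = ();
    }

    # keys() and each() walk the order array instead of the underlying hash.
    sub FIRSTKEY { my $self = shift; $self->{iter} = 0; return $self->NEXTKEY }
    sub NEXTKEY  { my $self = shift; return $self->{order}[ $self->{iter}++ ] }
}

tie my %products, 'OrderedHash';
$products{321} = { name => 'name321' };
$products{123} = { name => 'name123' };
$products{243} = { name => 'name243' };

print join( ' ', keys %products ), "\n";    # insertion order, not hash order
```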

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Data structure question

2000-06-06 Thread Nick Tonkin


On Tue, 6 Jun 2000, Drew Taylor wrote:

 "Ken Y. Clark" wrote:
  
  On Tue, 6 Jun 2000, Drew Taylor wrote:
  
   I have a question about data structures. Currently, I am doing SQL
   queries and returning an array ref and a hash ref. The array is to
   preserve order, and the hash contains various bits of data about that
  
  not to be dense, but can't you just issue an "order by" clause as part of
  your SQL statement?
 I do that already. But hashes don't preserve order on their own,
 otherwise I'd just use that alone. The array keeps things in the proper
 order, and the hash stores data about each individual record.


doh, but if you fetch the data using ORDER BY and then read in one hashref
at a time you are getting an ordered array, not a hash ...

this is way OT now ...

 


- nick

Re: Data structure question

2000-06-06 Thread Drew Taylor


Ken Miller wrote:
 
 Well, if the keys are unique, you could just return a hashref, and then
 access it using sorted keys:
 
 foreach( sort keys %$HR ) {
 ## insert useful stuff here
 }
If only I could just use sort. :-) The order could be completely
arbitrary, based on search parameters, individual rankings, etc.


 Are there any de-referenceing issues (performance wise) that would make
 this less efficient than the 2 structures? TIA for any pointers.
My guess was that whatever overhead dereferencing adds, the memory savings
would more than make up for it. And since I'm running mod_perl with perl's
malloc(), the extra memory doesn't get returned until the child exits.

 Probably not, except your method takes more mems, since you're returning an
 extra array.  'Course, the sort takes mems as well, but not as much as the
 extra array.
 
 And, there is the overhead of sorting the keys.


 I think an array of hashref's is probably the best bet.  Then you can use
 the DBs sort, and just build the array on the fly, once.
 
 For the site I'm working on, I return a reference to a ResultSet object
 which through the next() method returns the next row in the result set:
That is a very neat idea. From a logical point of view, I like it.
However, in my case that would be unnecessary overkill. I'll file it
away for future use. :-)

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

RE: Data structure question

2000-06-06 Thread Jerrad Pierce


you can use sort, if the values are hashes or arrays:

foreach ( sort { $hash{$a}{name} cmp $hash{$b}{name} } keys %hash )
or
foreach ( sort { $hash{$a}[0] cmp $hash{$b}[0] } keys %hash )

  o _
 /|/ |   Jerrad Pierce \ | __|_ _|
 /||/   http://pthbb.org  .  | _|   |
 \||  _.-~-._.-~-._.-~-._@"  _|\_|___|___|


Re: Data structure question

2000-06-06 Thread Drew Taylor


Stas Bekman wrote:
 
 and in perl5.6 it's called pseudohash (well it was known before but is
 supported in 5.6)
 
http://www.perl.com/pub/doc/manual/html/pod/perldelta.html#Pseudo_hashes_are_supported
I know about pseudohashes - thanks to Damien again! :-). They look very
cool, but to be honest I'm afraid to implement them yet until I've had
time to play with them more.

 also take a look at this:
 Building a Better Hash
 http://www.dfan.org/real/tpj_hash.html
I'll definitely take a look at this article.

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Data structure question

2000-06-06 Thread Drew Taylor


Jerrad Pierce wrote:
 
 you can use sort, if the values are hashes or arrays:
 
 foreach ( sort { $hash{$a}{name} cmp $hash{$b}{name} } keys %hash )
 or
 foreach ( sort { $hash{$a}[0] cmp $hash{$b}[0] } keys %hash )
In this case I can't use sort since the order is completely arbitrary,
based on the SQL issued. Hence the need for the array. ;-)

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Data structure question

2000-06-06 Thread Perrin Harkins


On Tue, 6 Jun 2000, Drew Taylor wrote:
 I know about tied hashes - Thanks Damien for your excellent book! - but
 there is a performance penalty. How big is this penalty? Is it worth
 using tied hashes? Versus an array of hash refs?

They're a lot slower than normal data structures, or even normal object
methods.  Whether that slowness will be noticeable next to the slowness of
accessing a database is questionable.  There were a few benchmarks posted
to this list that you could dig out of the archive.

- Perrin

Re: Data structure question

2000-06-06 Thread Stas Bekman


On Tue, 6 Jun 2000, Perrin Harkins wrote:

 On Tue, 6 Jun 2000, Drew Taylor wrote:
  I know about tied hashes - Thanks Damien for your excellent book! - but
  there is a performance penalty. How big is this penalty? Is it worth
  using tied hashes? Versus an array of hash refs?
 
 They're a lot slower than normal data structures, or even normal object
 methods.  Whether that slowness will be noticeable next to the slowness of
 accessing a database is questionable.  There were a few benchmarks posted
 to this list that you could dig out of the archive.

If you are going to run it with apache benchmarks try a fresh version of
Apache::Benchmark
http://stason.org/works/modules/Apache-Benchmark-0.01.tar.gz

but actually there is no need for it; Benchmark is perfect for that. I have
posted a few examples, so you can roll your own benchmark.
  perldoc Benchmark
will be no less helpful
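
A roll-your-own comparison could look something like the sketch below. It
uses the core Benchmark module and Tie::StdHash (defined in the core Tie::Hash
module) as a stand-in for any tied hash; the data, the read_all helper, and
the iteration count are arbitrary choices for illustration, and the numbers
will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Tie::Hash;    # core module; also defines Tie::StdHash

my %plain = map { $_ => "name$_" } 1 .. 100;

# Tie::StdHash behaves like a normal hash but routes every access
# through tie methods, so it isolates the cost of the tie layer.
tie my %tied, 'Tie::StdHash';
%tied = %plain;

my @keys = sort { $a <=> $b } keys %plain;

# Read every value once and return the total length, so the work is not optimized away.
sub read_all {
    my ($h) = @_;
    my $len = 0;
    $len += length $h->{$_} for @keys;
    return $len;
}

# Run each variant 5000 times and print a comparison table.
cmpthese( 5000, {
    plain => sub { read_all( \%plain ) },
    tied  => sub { read_all( \%tied ) },
} );
```

On a fast machine Benchmark may warn that there were too few iterations for a
reliable count; raising the count or passing a negative number (minimum CPU
seconds) addresses that.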

... just plugged this note about Apache::Benchmark so you'd go grab and
try the package before I release it...

_
Stas Bekman  JAm_pH --   Just Another mod_perl Hacker
http://stason.org/   mod_perl Guide  http://perl.apache.org/guide 
mailto:[EMAIL PROTECTED]   http://perl.org http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org

Re: Data structure question

2000-06-06 Thread darren chamberlain


Hi Drew,

How about writing a custom sort routine, based on the order you would be
using in the array, and returning that as a code ref? Sorting the hash
would be as simple as:

Common.pm:

sub getdata {
    # generate code ref here (numeric comparison used as a placeholder)
    my $CR = sub { $_[0] <=> $_[1] };
    my $HR = { 'sort_sub' => $CR, 123 => {foo => 'bar', name => 'name', price => '123'} };
    return ($HR);
}

Otherstuff.pm:

my ($HR) = $self->getdata();
foreach (sort { $HR->{'sort_sub'}->($a, $b) } grep !/sort_sub/, keys %{$HR}) {
    my $name = $HR->{$_}{name};
    ...
}

I'm assuming here that $AR's ordering may change based on a db select
or something similar; if it doesn't, then write it as a regular subroutine.

If sorting that way is really not an option, then just make $AR an array ref
as you had it, make it an element of $HR called 'sort', and change the foreach
in Otherstuff.pm to read something like:

foreach (grep !/sort/, @{$HR->{'sort'}}) {
   # la la la
}

darren

-- 
One man's "magic" is another man's engineering.  "Supernatural" is a null word.
-- Robert Heinlein

Re: Data structure question

2000-06-06 Thread Drew Taylor


darren chamberlain wrote:
 
 Hi Drew,
 
 How about writing a custom sort routine, based on the order you would be
 using in the array, and returning that as a code ref? Sorting the hash
 would be as simple as:
In this case, it's overkill: the DB has already put the data together in
the order I need it. I just have to transfer that structure/order to the
rest of my code.

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/

Re: Data structure question

2000-06-06 Thread Drew Taylor


Stas Bekman wrote:
 
 On Tue, 6 Jun 2000, Perrin Harkins wrote:
 
  On Tue, 6 Jun 2000, Drew Taylor wrote:
   I know about tied hashes - Thanks Damien for your excellent book! - but
   there is a performance penalty. How big is this penalty? Is it worth
   using tied hashes? Versus an array of hash refs?
 
  They're a lot slower than normal data structures, or even normal object
  methods.  Whether that slowness will be noticeable next to the slowness of
  accessing a database is questionable.  There were a few benchmarks posted
  to this list that you could dig out of the archive.
I knew they were slower. I'll look for the benchmarks in the archives.
Unless I find something really cool that justifies tie()ing, I'm just
going to go with my original idea. The DB already gives me the order I
want - I just need to transfer it. But I really like the ideas behind
tie'ing things. You can do some really neat stuff behind the scenes. :-)

 If you are going to run it with apache benchmarks try a fresh version of
 Apache::Benchmark
 http://stason.org/works/modules/Apache-Benchmark-0.01.tar.gz
 
 ... just plugged this note about Apache::Benchmark so you'd go grab and
 try the package before I release it...
Nothing like a good plug. ;-)

-- 
Drew Taylor
Vialogix Communications, Inc.
501 N. College Street
Charlotte, NC 28202
704 370 0550
http://www.vialogix.com/
