I cannot speak for Kostas, but I often mix text and numbers. Consider an example where I have a list of text identifiers and corresponding numbers, e.g., p-values. Now I want to find the identifiers with p < 0.05. If $id and $p were piddles with text identifiers and p-values, respectively, I could do

    my $sel = which($p < 0.05);
    wcols $id($sel);

to print the identifiers corresponding to my selection. Alas, I cannot do this, so I have to do what Bryan does: employ a hash and an array for converting between strings and integer IDs, and then, in order to print the selected identifiers, I'd have to use a loop. Much less elegant.

It would be nice to have (variable-length) strings implemented in PDL.

Marek

On Fri, 24 Jul 2015 at 14:25 David Mertens <[email protected]> wrote:

> Hello Kostas,
>
> To follow up on what Bryan said, I wonder what sort of PDL functionality
> you hope to use with a piddle of words, as opposed to a normal Perl array.
> I have a hard time imagining you'll need the multidimensional handling PDL
> provides. Even if you want a list of lists, PDL will only work with a
> collection of lists that have identical length. A Perl list of lists can
> accommodate variable length lists, and those lists can accommodate strings
> of variable length. Perl's map and grep are pretty flexible and fast, too.
>
> On the other hand, if you're doing computational linguistics, the typical
> approach I've seen is to map all words to integers and analyze the
> collections of integers. You can build a hash lookup table to map from the
> words to the integers, and a regular Perl array of the words themselves can
> map integer offsets to the original words.
>
> Of course, I could be wrong. What is the actual problem you are trying to
> solve?
>
> David
>
> On Fri, Jul 24, 2015 at 9:10 AM, Bryan Jurish <[email protected]>
> wrote:
>
>> moin Konstantinos,
>>
>> afaik, builtin support only includes PDL::Char, which is restricted to
>> fixed-length strings encoded as byte-values (e.g. ASCII). There's also
>> Zakariyya Mughal's Data::Frame which seems capable of handling
>> variable-length strings, but I'm unclear on the details; perhaps he can
>> chime in. Whenever I need to do something like this (very often, since I
>> work with text data), I usually end up building an extra hash+array pair
>> for mapping back and forth between strings and integer-IDs, and let PDL
>> work with just the IDs. Not pretty, but it works.
>>
>> marmosets,
>> Bryan
>>
>> On Fri, Jul 24, 2015 at 2:38 PM, Konstantinos Billis <[email protected]>
>> wrote:
>>
>>> Hi people,
>>>
>>> Just a quick question. I am using PDL to build arrays, for example with
>>> the "zeroes" function. If I understand correctly, the elements of those
>>> arrays should contain only numbers (or bad, inf etc). Could I use any
>>> other function for creating lists/arrays of strings/words instead of
>>> numbers? In other words, for example, to initialize an array with NULLs
>>> and then add strings or words in particular positions of the array.
>>>
>>> Many Thanks,
>>> Kostas
>>>
>>> _______________________________________________
>>> pdl-general mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/pdl-general
>>
>> --
>> Bryan Jurish "There is *always* one more bug."
>> [email protected] -Lubarsky's Law of Cybernetic Entomology
>
> --
> "Debugging is twice as hard as writing the code in the first place.
> Therefore, if you write the code as cleverly as possible, you are,
> by definition, not smart enough to debug it."
> -- Brian Kernighan

--
Dr Marek Gierliński
Data Analyst
The Data Analysis Group
The Barton Group
Division of Computational Biology and GRE
College of Life Sciences
University of Dundee, Dundee, Scotland, UK.
Tel:+44 1382 386427
www.compbio.dundee.ac.uk/dag.html
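
For reference, the hash-plus-array workaround discussed in this thread might look roughly like the sketch below. It is only an illustration under assumptions, not anyone's actual code: the identifiers, the p-values and the names %index_of and @selected are made up for the example. The strings stay in a plain Perl array, PDL sees only numbers, and an array slice over the index list stands in for an explicit loop.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical example data: identifiers live in a plain Perl array,
# the matching p-values live in a piddle at the same positions.
my @id = qw(geneA geneB geneC geneD);
my $p  = pdl(0.20, 0.01, 0.04, 0.50);

# String -> integer index lookup (the hash half of the workaround).
my %index_of = map { $id[$_] => $_ } 0 .. $#id;

# PDL does the numeric selection and returns integer indices.
my $sel = which($p < 0.05);

# Map the indices back to strings with an ordinary array slice.
my @selected = @id[ $sel->list ];
print "$_\n" for @selected;

# Going the other way: fetch a p-value by its identifier.
my $p_geneC = $p->at( $index_of{geneC} );
```

The array alone is enough to turn selected indices back into identifiers; the hash is only needed when starting from a string and looking for its position in the piddle.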
