On 2/2/07, Gauthier, Dave <[EMAIL PROTECTED]> wrote:
Getting unwanted list elements when using split with regex.  Here's an
example....



$str = "abc=In";

@arr = split(/[a-zA-Z0-9]/,$str);
[snip]
If I change "abc=In" to "abcdef=In", I get 6 unwanted null elements (one
per char before the "=").

I was expecting a single-element list with $arr[0] = "=".

What's up? Is there a clean way to prevent creating these unwanted
elements?

Dave,

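What's up is that split splits on each character, just like you asked
it to: it returns a list of all the things between pairs of
[a-zA-Z0-9]. In your case, that's three empty strings followed by an
'=' (the empty fields after the '=' are trailing, and split drops
trailing empty fields by default).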
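
A quick way to see exactly what split hands back is to print each
field in quotes. This is only a sketch built on the $str and @arr from
your post; the quoting via map is just my choice for making the empty
strings visible:

```perl
use strict;
use warnings;

my $str = "abc=In";

# Every alphanumeric character is a delimiter, so split returns whatever
# sits between consecutive delimiters, including nothing at all.
my @arr = split /[a-zA-Z0-9]/, $str;

# Print each field in quotes so the empty ones are visible.
print scalar(@arr), " fields: ", join(", ", map { "'$_'" } @arr), "\n";
# prints: 4 fields: '', '', '', '='
```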

That seems silly in your case, but how should split know when it's
silly and when it's not? The typical use of split is to deal with some
sort of tabular or delimited data. A common example is an /etc/passwd
file:

   smith:*:100:100:8A-74(office):/home/smith:/usr/bin/sh
   guest:*:200:0::/home/guest:/usr/bin/sh

If you're iterating through the file with

   my @line = split /:/;

you always want the home directory to be in $line[5]. If split
silently ignored empty fields, though, by treating consecutive
delimiters as a single delimiter, you'd never know the index of the
field you were looking for in any given line. Sometimes it might be
[5], sometimes it might be [4]. The same problem would arise with CSV,
tab-delimited files, or any of the other data formats split is
normally used to parse.
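
Here is a rough sketch of that using the two sample lines above. The
@passwd array and the $entry loop variable are stand-ins I made up so
the example runs without reading a real file:

```perl
use strict;
use warnings;

# The two sample lines from above; the second has an empty comment field.
my @passwd = (
    'smith:*:100:100:8A-74(office):/home/smith:/usr/bin/sh',
    'guest:*:200:0::/home/guest:/usr/bin/sh',
);

for my $entry (@passwd) {
    my @line = split /:/, $entry;
    # The empty field is kept, so the home directory is always index 5.
    print "$line[0] => $line[5]\n";
}
# prints:
#   smith => /home/smith
#   guest => /home/guest
```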

If you want to treat consecutive delimiters as single delimiters, add
a '+' to your regex, as Tom suggested.
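
For what it's worth, here is roughly what that looks like with your
longer string. Note that one leading empty field still comes back,
because the string starts with a delimiter; the grep that throws it
away is my own addition, not part of Tom's suggestion:

```perl
use strict;
use warnings;

my $str = "abcdef=In";

# With '+', a whole run of alphanumerics counts as one delimiter.
my @arr = split /[a-zA-Z0-9]+/, $str;
# @arr is now ('', '='): one leading empty field remains because the
# string starts with a delimiter.

# Dropping empty fields afterwards leaves just the '='.
my @wanted = grep { length } @arr;
print join(", ", map { "'$_'" } @wanted), "\n";   # prints: '='
```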

In fact, though, you seem to be doing exactly the opposite of what
split is designed for. If you're not actually splitting strings into
delimited fields, but instead extracting substrings that meet certain
criteria, you might want to consider using a match instead, something
like:

   @matches = $line =~ m/\w+(\W+)(?=\w)/g;    # or /\w+(\W+)/, depending
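
Run against your original string, that would look something like the
sketch below. I'm assuming $line holds the same "abc=In" you had in
$str:

```perl
use strict;
use warnings;

my $line = "abc=In";

# In list context with /g, the match returns every captured group:
# here, each run of non-word characters that follows a word and is
# itself followed by another word character.
my @matches = $line =~ m/\w+(\W+)(?=\w)/g;

print scalar(@matches), " match: '$matches[0]'\n";   # prints: 1 match: '='
```

No empty elements to clean up, since the match only returns what the
parentheses captured.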

HTH,

-- jay
--------------------------------------------------
This email and attachment(s): [  ] blogable; [ x ] ask first; [  ]
private and confidential

daggerquill [at] gmail [dot] com
http://www.tuaw.com  http://www.downloadsquad.com  http://www.engatiki.org

values of β will give rise to dom!
