"Nilay Puri, Noida" <[EMAIL PROTECTED]> wrote: >Can any one walk me thru this piece of code :: > >while(<STDIN>) >{ > chomp ; > $isbn =(split(/^_/, $_))[0] ; --- not able to understand what is >being accessed (......)[0] > unless ($KEYS{$isbn} ) ---- isbn is a scalar variable, how keys >wok on it ? > { > print "$_\n" ; > $KEYS{$isbn} =1 ; > } >}
I'm not sure what the intent of the code is, but I would guess you're parsing a set of lines from a file, each line containing an ISBN and some other data, to extract all the unique ISBNs from it, and print those numbers without repeating duplicates, as a side effect leaving you with a hash containing all those unique ISBNs. It would be useful to see some of the data the code is intended to process.

Assuming that's the goal, you're not far from it.

The (......)[0] says "take the result of the split, which is a list, and get the 0th or first element from it". This works because you can subscript a list the same way you can subscript an array variable. In other words, the expression gets the first thing from the list that results from splitting, which is probably meant to be the first thing on the line, which is probably meant to be the ISBN.

The $KEYS{$isbn} expression is a hash access, getting from hash %KEYS the value associated with key $isbn. You're right, 'keys' is a function which is used with hashes, but in this case KEYS is also a variable.

The split pattern seems unusual. As written, it says "split the string at the beginning if the line starts with an underscore". The caret character in the pattern will match on the start of the string; the string will consist of an entire line read from the file, put into the variable $_ by the read operation <STDIN>.

I've never tried to split on the beginning of the string, so let's write a test script that does that and see what happens.

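A quick aside first: the two constructs described above, subscripting a list and looking up a hash key, can be tried in isolation. This is only a sketch; the sample line and the comma separator are made up for illustration and are not taken from your data.

```perl
use strict;
use warnings;

# Made-up sample line; the comma separator is an assumption, not your data.
my $line = "1234,some text";

# split returns a list; (...)[0] subscripts that list directly.
my $isbn = (split /,/, $line)[0];

# Equivalent long form, going through a named array.
my @fields = split /,/, $line;
my $same   = $fields[0];

# %KEYS is an ordinary hash that merely resembles the name of the
# built-in keys function (Perl is case sensitive, so they never clash).
# $KEYS{$isbn} is the value stored under the key $isbn.
my %KEYS;
print "new: $isbn\n" unless $KEYS{$isbn};   # nothing stored yet, so this prints
$KEYS{$isbn} = 1;
print "seen: $isbn\n" if $KEYS{$isbn};      # now the lookup is true

# The keys function, by contrast, returns the list of keys in a hash.
my @all = keys %KEYS;
print "keys: @all\n";
```

Now, on to the split test.
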
testsplitBOL.pl
---------------
    use warnings;
    use strict;

    my @result;

    while(<STDIN>){
        chomp;
        print "Processing line: >$_<\n";
        @result = split( /^_/, $_);
        print "++Split resulted in ", scalar(@result), " items.\n";
        print "++First element of split is >", $result[0], "<.\n";
    }

result:
-------
    D:\MCD\dvl\scripts>type testsplitBOL.txt
    1234 some text
    _5678 some other text

    _9012_some_other_text_separated_by_underscores
    _7654
    0987

    D:\MCD\dvl\scripts>type testsplitBOL.txt | perl testsplitBOL.pl
    Processing line: >1234 some text<
    Split resulted in 1 items.
    First element of split is >1234 some text<.
    Processing line: >_5678 some other text<
    Split resulted in 2 items.
    First element of split is ><.
    Processing line: ><
    Split resulted in 0 items.
    Use of uninitialized value in print at testsplitBOL.pl line 11, <STDIN> line 3.
    First element of split is ><.
    Processing line: >_9012_some_other_text_separated_by_underscores<
    Split resulted in 2 items.
    First element of split is ><.
    Processing line: >_7654<
    Split resulted in 2 items.
    First element of split is ><.
    Processing line: >0987<
    Split resulted in 1 items.
    First element of split is >0987<.

From these results we can see several things:

- Splitting on the beginning of the string, when successful, appears to give you an empty string as the first elem of the resulting list. Your code would take this to be an ISBN and use it as a hash key, which is certainly not correct.

- When the split does not match its pattern, it yields a list consisting of a single element, the original string.

- The pattern in split only matches lines that begin with underscore. Whether or not that's what you want depends on your data.

- The code should have a test to make sure the line is not just an empty string.

Note that the Perl documentation for split says

    A PATTERN of /^/ is treated as if it were /^/m, since it isn't
    much use otherwise

but that doesn't seem to apply here, both because your pattern is not /^/ (rather, it is /^_/) and because that doesn't seem to be what's happening in the test results.

I'd be glad to help you code up your loop, but we really need to see a sample of data to understand the task. In any event, I think you want something like this:

    use warnings;
    use strict;

    my %KEYS = ();
    my $isbn;
    my @result;

    while(<STDIN>){
        chomp;
        # select only those lines to split: not empty and start with
        # underscore, or whatever
        if( /^_/ ){
            # split on whatever separates ISBN from what follows it on line
            @result = split( /:/, $_);
            # make sure the split actually split something
            if( scalar( @result ) > 1 ){
                # we assume the ISBN is first thing on the line
                $isbn = $result[0];
                # make sure this ISBN hasn't already been printed before
                # printing it
                unless( $KEYS{$isbn} ){
                    print "$_\n";
                    $KEYS{$isbn} = 1;
                }
            }
            else {
                die "Error processing line $.: $_ could not be split.\n";
            }
        }
    }

Or something like that. You could make it more concise, but that's the basic idea.

Show us your data!
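
P.S. On the "more concise" point, here is one possible sketch, not a definitive version. It relies on the common idiom of post-incrementing a hash value, which is false only the first time a key is seen. The sub name unique_by_first_field, the ':' separator and the sample lines are all hypothetical, since I haven't seen your data. Unlike the loop above, it does not filter on a leading underscore, and it quietly keeps a line with no separator (treating the whole line as the ISBN) instead of dying.

```perl
use strict;
use warnings;

# Return the lines whose first field has not been seen before.
# $sep is a regex for the field separator (an assumption; adjust to the data).
sub unique_by_first_field {
    my ($sep, @lines) = @_;
    my %seen;
    my @kept;
    for my $line (@lines) {
        next unless length $line;                    # skip empty lines
        my $isbn = (split $sep, $line)[0];           # first field of the line
        next unless defined $isbn && length $isbn;   # skip an empty first field
        push @kept, $line unless $seen{$isbn}++;     # false only on first sighting
    }
    return @kept;
}

# Made-up sample data standing in for the real file.
my @sample = ('1234:some text', '5678:other text', '', '1234:duplicate');
print "$_\n" for unique_by_first_field(qr/:/, @sample);
```

To run it over real input, the sample array could be replaced by lines read from STDIN and chomped, with the separator changed to whatever actually follows the ISBN.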