Hi all,

Once again I am sending a message to this list, hoping you will bear with
me and my novice question.

In my script I would like to split a single scalar containing an arbitrary
passage of text into a list that holds the words of the text.

What I found in the Perl documentation is the following example
[perlrequick - The split operator]:


# For example, to split a string into words, use
#
#     $x = "Calvin and Hobbes";
#     @word = split /\s+/, $x;  # $word[0] = 'Calvin'
#                               # $word[1] = 'and'
#                               # $word[2] = 'Hobbes'

In my case, however, I would also like linefeeds to be treated as "words",
because I want to preserve them (so that the layout of the text is not
mixed up when the list of words is written to an outfile).
Consequently I thought that using, as in the example above,

 \s = [\x20\f\t\r\n]

 minus

 \n

 would preserve my linefeeds. However, when I use this regexp I get unwanted
results.



The input ($all_of_it) is:
Word a_linefeed_follows_now[linefeed]
Word[linefeed]


test1: use normal \s
--------------------

my @word = split /\s+/, $all_of_it;

foreach (@word) {
        print "This element is: ", $_, "\n";
}


output:
This element is: Word
This element is: a_linefeed_follows_now
This element is: Word


test2: \s   minus   \n
----------------------

my @word = split /[\x20\f\t\r]+/, $all_of_it;


output:
This element is: Word
This element is: a_linefeed_follows_now
Word


The newline character is no longer treated as a separator, but I now have
a single element "a_linefeed_follows_now\nWord".
What I would like instead is to have this split into
"a_linefeed_follows_now", "\n", "Word".
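
A minimal sketch of one direction that might produce this list, assuming it is acceptable to collect the pieces with a global match instead of split. The sub name words_and_linefeeds is made up for illustration; the character class is the same "\s minus \n" set as above, negated so that it matches the words themselves:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the words of $text, with every linefeed
# kept as a separate "\n" element.
sub words_and_linefeeds {
    my ($text) = @_;
    # In list context, m//g returns every match: either a single linefeed,
    # or a run of characters that are not in [\x20\f\t\r\n].
    my @parts = $text =~ /\n|[^\x20\f\t\r\n]+/g;
    return @parts;
}

my $all_of_it = "Word a_linefeed_follows_now\nWord\n";
my @word = words_and_linefeeds($all_of_it);

foreach my $element (@word) {
    (my $shown = $element) =~ s/\n/\\n/g;   # make linefeeds visible
    print "This element is: $shown\n";
}
```

With the sample input this should give the elements "Word", "a_linefeed_follows_now", "\n", "Word", "\n". Spaces, tabs, form feeds and carriage returns are dropped, exactly as in test2.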


Can somebody help out with a regular expression or any other idea for this?
Any help is very much appreciated.

-- Tim


Appendix:

An alternative solution I am also thinking about is:

my @word = split /[\x20\f\t\r]+/, $all_of_it;   # as in test2

foreach (@word) {
        if ($_ =~ /\n/) {       # we found an element which still
                                # includes a linefeed character

                $_ = split /\n/, $_;
                        # Well, what now? What would a Perl newbie be
                        # without problems?
                        # The problem here:
                        # how do you split one element into 2 elements, and
                        # how do you then merge them back into the @word array?
        }
}
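
A rough sketch of how this second idea might be completed, assuming it is fine to build a new array instead of modifying @word in place. The sub name expand_linefeeds is hypothetical. It relies on the documented behaviour that split also returns whatever the parentheses in its pattern captured, so /(\n)/ keeps each linefeed as its own field:

```perl
use strict;
use warnings;

# Hypothetical helper: split on "\s minus \n" as in test2, then break
# every resulting element apart at its linefeeds.
sub expand_linefeeds {
    my ($text) = @_;
    my @expanded;
    for my $element (split /[\x20\f\t\r]+/, $text) {
        # The capturing parentheses make split return the "\n" separators
        # too; grep drops the empty strings that appear between adjacent
        # linefeeds or before a leading one.
        push @expanded, grep { length($_) } split /(\n)/, $element;
    }
    return @expanded;
}

my $all_of_it = "Word a_linefeed_follows_now\nWord\n";
my @word = expand_linefeeds($all_of_it);
print scalar(@word), " elements\n";
```

Pushing onto a second array avoids the question of how to splice several new elements into @word while looping over it.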

-- eof --