Redirect Print

2007-05-30 Thread Naji, Khalid
Hi,

Is there any way to suppress the output of a function that calls print,
without modifying the function itself?

Like:

My ($Name) = get_name(12345);

Sub get_name
{
$id = @_;
...
Print Blalalalalalalbla...\n;
Return (Name);
}


Thanks,
KN



--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/




Re: Redirect Print

2007-05-30 Thread Sean Davis
On Wednesday 30 May 2007 11:56, Naji, Khalid wrote:
> Hi,
>
> is there any way to mask a printing, when I call a Function using the
> Command print and without to modify this Function:
>
> Like:
>
> My ($Name) = get_name(12345);
>
> Sub get_name
> {
> $id = @_;
> ...
> Print Blalalalalalalbla...\n;
> Return (Name);
> }

Printing will go to stdout, which is then sent to the browser.  If you don't 
want that to go to the browser, you can print to STDERR and that will 
typically end up in the error_log file for your server.

Sean





Re: Redirect Print

2007-05-30 Thread Paul Lalli
On May 30, 11:56 am, [EMAIL PROTECTED] (Khalid Naji) wrote:
> is there any way to mask a printing, when I call a Function using the
> Command print and without to modify this Function:
>
> Like:
>
> My ($Name) = get_name(12345);
>
> Sub get_name
> {
> $id = @_;
> ...
> Print Blalalalalalalbla...\n;
> Return (Name);
>
> }


You can open a filehandle to the system's /dev/null and then select
that filehandle to make it the default.  Make sure you reselect the
original filehandle when the subroutine is done.  For example:
$ perl -le'
print "Start";
open my $devnull, ">", "/dev/null" or die $!;
my $old_fh = select $devnull;
mysub();
select $old_fh;
print "End";
sub mysub {
   print "In mysub";
}
'
Output:
Start
End


Paul Lalli

P.S.  Please don't use a word processor to compose a post to a
technical list/newsgroup.  Or at the very least, turn off the word
processor's auto-capitalization feature.  You make it impossible for
anyone to copy and paste your code...
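
The select approach above can be wrapped in a small helper so that the original handle is restored even if the called function dies. This is only a sketch: the name quietly is hypothetical, get_name is a stand-in for the function from the question, and File::Spec->devnull is used on the assumption that a portable null device is wanted instead of a hard-coded /dev/null.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: run a code reference with the default output
# handle pointed at the null device, then restore the previous handle.
sub quietly {
    my ($code) = @_;
    open my $devnull, '>', File::Spec->devnull()
        or die "Cannot open null device: $!";
    my $old_fh = select $devnull;        # bare print now goes to the null device
    my @result = eval { $code->() };     # trap errors so we can always restore
    my $err    = $@;
    select $old_fh;                      # put the original handle back
    close $devnull;
    die $err if $err;
    return wantarray ? @result : $result[0];
}

# Stand-in for the function from the question; it prints while it works.
sub get_name {
    my ($id) = @_;
    print "Looking up $id...\n";
    return "Name$id";
}

my ($name) = quietly(sub { get_name(12345) });
print "Got: $name\n";
```

Note that select only changes where a bare print goes. A function that writes with an explicit handle, such as print STDOUT ..., would not be silenced by this sketch.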






encode UTF8 - MIME

2007-05-30 Thread cc96ai
I have the UTF-8 value %C3%A9.
How can I decode it so that it becomes é?

I tried encode_base64, but no luck.
Maybe I am missing something; does anyone have an idea?






Re: Outlook CSV Parser

2007-05-30 Thread Mumia W.

On 05/30/2007 12:40 AM, Laxminarayan G Kamath A wrote:
Hi PERLers, 
	We here at DeepRoot Linux were trying to parse Outlook's csv so

that I can add them to ldap addressbook.. [...]


The Perl module Text::CSV_XS would make your work much simpler, and it 
might execute a little faster.










RE: Array of Array refs

2007-05-30 Thread Andrew Curry
Enough is enough; can we leave this thread be now? This just puts people off
posting questions looking for help, for fear of joining some flame war.

-Original Message-
From: Brian [mailto:[EMAIL PROTECTED] 
Sent: 29 May 2007 19:30
To: beginners@perl.org; [EMAIL PROTECTED]
Subject: Re: Array of Array refs

On May 29, 6:06 am, [EMAIL PROTECTED] (Paul Lalli) wrote:
 On May 29, 4:58 am, [EMAIL PROTECTED] (Brian) wrote:

  On May 28, 6:14 pm, [EMAIL PROTECTED] (Paul Lalli) wrote:

 oh yes, more important than all that minutiae... the push did 
not work for me in the working code.

   The push worked absolutely fine.  It just didn't do what you 
   wanted it to.  Learning how to parse your problem should be your 
   first step toward becoming a better programmer.

   hmmm, misunderstanding there. The push worked fine in the sample I 
  posted, but not in the more complex working program I had simplified 
  as an example.

 nope, sorry, you're wrong.  push() works perfectly well.  It adds 
 elements to an array.  If your program produced incorrect results, it 
 is because you did something wrong, not because push didn't work.

ugh. pedantic semantics...


 Of
 course, as you haven't shown any code that produces these undesired 
 results, we can only guess as to what your actual problem was.

yes, the real code is beyond what I would think of as beginner
I truly meant to just post a small, working piece of code that worked with
some basic data structures... The DBI client is a bit more complicated, yes?


The array was being rewritten.
   Then you didn't declare your variables in the correct scope.
  understandable misperception on your part, as above

 Nope.  If your array is being used in a loop, the contents of that 
 array are changing when you don't want them to be changing, and 
 instead want to be creating new arrays, you declared your array in the 
 wrong scope.


The sample I put out and the code I was
working on are not identical. I declared my array in a different scope in
the sample. Neither is wrong, they are different.

I had to use an array copy

  push @tRespsA, [ @r1 ];   ## copy contents to an anonymous array, push array ref

   Do you understand *why* that was necessary?  Do you understand the 
   difference between these two pieces of code?

actually, I do indeed. In C++, the concept of a deep copy, vs 
  shallow copy vs ref comes up all the time. I am just learning the 
  syntax here, not programming itself

 perhaps you should be. . .

ugh. insults. I am not trying to insult you Paul, why resort to that?
You have no idea how many programs I've written, well. I'd say it's bad form
to assume the worst or lowest in people.


aha, a force to be reckoned with. Your point above about the docs 
  is quite true.
  I dont have time to rewrite docs right now.. They do need work 
  though

 You don't even have time to point out what you find to be wrong with 
 them?  But instead you do have time to create examples that you claim 
 to be in the service of newbies, all the while saying that the docs 
 are bad?  I'd suggest you could do with attending a few more time- 
 management seminars.


please examine logic - rewriting core Perl docs vs a 10 line sample program

  This is sometimes appearing to be contentious

 No sometimes about it.  Every post you've made thus far is 
 contentious, and so I have answered in kind.


no, I'd suggest you read it that way.. Its notoriously difficult to convey
context and nuance in a few lines of ascii text. Please refer to your RTFM
and then essentially yelling at me for a modest sample program.

I am not backing down from your brow-beating. I posted a small working
sample program then found out more about the context I was dealing with and
attempted to discuss it in an unassuming manner. For this I get these
responses. phooey!

I'll be posting again from time to time, and probably going to OSCon.
Happy
to talk with you anytime. May not respond every time though. I think I am
sensing a pattern here...

best regards
  -Brian
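
The copy-versus-reference point argued over in this thread can be sketched in a few lines. The variable names below are made up for illustration and are not taken from Brian's program, which was never posted.

```perl
use strict;
use warnings;

my (@by_ref, @by_copy);
my @row;                        # declared outside the loop, so it is reused

for my $n (1 .. 3) {
    @row = ($n, $n * 10);
    push @by_ref,  \@row;       # every element points at the same array
    push @by_copy, [ @row ];    # each element is a fresh anonymous copy
}

print join(' ', map { $_->[0] } @by_ref),  "\n";   # 3 3 3
print join(' ', map { $_->[0] } @by_copy), "\n";   # 1 2 3
```

Declaring the array with my inside the loop body would also give a new array on every iteration, which is the scoping fix suggested earlier in the thread; the [ @row ] copy is the alternative when the array has to live outside the loop.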








Re: Outlook CSV Parser

2007-05-30 Thread Laxminarayan G Kamath A
On Wed, 30 May 2007 01:26:30 -0500, Mumia W. mumia.w.18.spam
[EMAIL PROTECTED] wrote:


 The Perl module Text::CSV_XS would make your work much simpler, and
 it might execute a little faster.

Thank you for pointing that out, but we have already tried it!
Unfortunately, it failed to separate the records correctly.
We have also tried several more modules from CPAN, and they were
not able to parse Outlook's CSV either.

If you read my mail again, you might find that I already mentioned that
we tried several modules before falling back to writing our own code.

What I am hoping for is help with a variant of the regex I used as the
condition for the while loop. I am sure that if we modify that regexp a
little, we can just use it on the record like this:

$_ = $record;
@fields = /regexp/g;

I tried a lot of different ways but just could not get the right
regexp :-(. 

-- 
Cheers,
Laxminarayan G Kamath A
e-mail: [EMAIL PROTECTED]
Work URL: http://deeproot.in





Re: Outlook CSV Parser

2007-05-30 Thread Dr.Ruud
Laxminarayan G Kamath A schreef:

 The stumbling blocks : there are several types of problems in
 Outlook's CSV ..

You forgot to supply a link to such a file. Or show a __DATA__ section
for testing.

-- 
Affijn, Ruud

Gewoon is een tijger.






Re: encode UTF8 - MIME

2007-05-30 Thread Mumia W.

On 05/29/2007 07:00 PM, cc96ai wrote:

I got UTF8 value %C3%A9
how could I encode it become é ?

I try encode_base64 , but no luck
maybe I miss some, anyone have idea ?




You need to provide more detail about your problem.







Re: Outlook CSV Parser

2007-05-30 Thread Mumia W.

On 05/30/2007 03:04 AM, Laxminarayan G Kamath A wrote:

[...]
I tried a lot of different ways but just could not get the right
regexp :-(. 



I reiterate what the eminent Dr. Ruud said. I need some data to play 
with before I play with the code you posted.









Re: Outlook CSV Parser

2007-05-30 Thread Ken Foskey
On Wed, 2007-05-30 at 13:34 +0530, Laxminarayan G Kamath A wrote:

 What I am expecting is help with the variant of the regex I used as the
 condition for while loop. I am sure If we modify that regexp a little
 bit, then we can just use it on the record like this :
 
 $_ = $record;
 @fields = /regexp/g;
 
 I tried a lot of different ways but just could not get the right
 regexp :-(. 

CSV is a horrible format.  Far too unreliable,  we have exported CSV
from excel that imported differently into excel.

Is there another option,  eg connecting to Outlook via a remote
connection?

Is there another format available?

I doubt a simple regex will do it if the CSV modules do not work.

What data do you have problems with?  Without samples there is nothing
we can do.


-- 
Ken Foskey
FOSS developer






zero width lookahead match

2007-05-30 Thread Sharan Basappa

Hi All,

I have some background working with scanners built with Flex, and I have
used Flex's lookahead capability many times. But I don't understand the
meaning of ZERO in the zero lookahead match rule, i.e. (?=pattern).

For example, to capture overlapping 3-digit patterns from the string
$str = "123456" I use the regex @store = $str =~ m/(?=(\d\d\d))/g;
So here the regex engine actually looks ahead by three digits.

The other question I have is: how does the regex engine decide that it has to
move its scanner ahead by 1 character every time? I get the output
123 234 345 456 when I run this script.

Regards,
Sharan


Re: zero width lookahead match

2007-05-30 Thread Chas Owens

On 5/30/07, Sharan Basappa [EMAIL PROTECTED] wrote:

Hi All,

I have some background working with scanners built from Flex. And I have
used lookahead capability of flex many a times. But I dont understand the
meaning of ZERO in zero lookahead match rule i.e. (?=pattern)

snip

I don't know jack about flex, so I can't help you with a comparison, but

snip

The other question I have is - how does regex engine decide that it has to
move further its scanner by 1 character everytime

snip

This is what the zero-width lookahead assertion means.  It says: without
moving from where you are currently starting the match, make certain
you can match the following pattern.  If you want the place where the
match starts to move, then you have to include something that does not have
zero width, like this:

# match a group of three digits only when three more digits follow it:
# in 123456 that is just 123, because nothing follows 456
@store = $str =~ m/(\d\d\d)(?=\d\d\d)/g;
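
To make the difference between consuming and zero-width matching concrete, here is a minimal sketch that runs three patterns over the same string. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $str = '123456';

# Lookahead only: nothing is consumed, so a match can start at every position.
my @overlapping = $str =~ /(?=(\d\d\d))/g;

# No lookahead: each match consumes three digits, so matches cannot overlap.
my @consuming = $str =~ /(\d\d\d)/g;

# Consume three digits, but only where three more digits follow.
my @guarded = $str =~ /(\d\d\d)(?=\d\d\d)/g;

print "overlapping: @overlapping\n";   # 123 234 345 456
print "consuming:   @consuming\n";     # 123 456
print "guarded:     @guarded\n";       # 123
```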





Re: zero width lookahead match

2007-05-30 Thread Chas Owens

On 5/30/07, Sharan Basappa [EMAIL PROTECTED] wrote:

Hi All,

I have some background working with scanners built from Flex. And I have
used lookahead capability of flex many a times. But I dont understand the
meaning of ZERO in zero lookahead match rule i.e. (?=pattern)

snip

You may also prefer to use the Parse::RecDescent module.





Re: zero width lookahead match

2007-05-30 Thread Sharan Basappa

this is what the zero-width lookahead assertion means.  It say with
out moving where you are currently starting the match, make certain
you can match the following pattern.  If you want it to move where the
match starts then you have to include something that does not have
zero-width like this



#match groups of three characters followed by three characters: 123 and

456

@store = $str =~ m/(\d\d\d)(?=\d\d\d)/g;


You mention that if I write a rule like @store = $str =~ m/(?=(\d\d\d))/g;
then the scanner does not move ahead. But as I mentioned in my mail,
the result of this regex is 123 234 etc. This clearly shows that after every
match, Perl's regex engine is moving its pointer to the next char in the
string (i.e. it starts looking at 23456 once 123 is matched).
This was exactly my question.

Regarding the other question about comparing with Flex, there is actually
no need to compare with Flex. What I was trying to understand is why
it is called a zero lookahead rule when the number of chars it looks ahead
depends on the rule I write. For example, the regex in the above rule looks
3 chars ahead to find a match.

Regards,
Sharan







Re: zero width lookahead match

2007-05-30 Thread Chas Owens

On 5/30/07, Sharan Basappa [EMAIL PROTECTED] wrote:

 this is what the zero-width lookahead assertion means.  It say with
out moving where you are currently starting the match, make certain
you can match the following pattern.  If you want it to move where the
match starts then you have to include something that does not have
zero-width like this

 #match groups of three characters followed by three characters: 123 and
456
 @store = $str =~ m/(\d\d\d)(?=\d\d\d)/g;

You mention that if I write a rule like @store = $str =~ m/((?=\d\d\d))/g;
then the scanner does not move ahead. But as I mentioned in my mail,
the result of this regex is 123 234 etc. This clearly shows that after every
match,
the regex engine of perl is moving its pointer to next char in the string
(i.e. it starts
looking at 23456 once 123 is matched)
This was exactly my question.

snip

Because it always moves ahead by either one character or the match,
but zero-width constructs do not consume any characters.  That is why
they are called zero-width.

snip

Regarding the other question about comparing with Flex, actually there is
no need to compare with flex. What I was trying to understand is, why is
that
it is called zero lookahead rule when the number of chars it looks ahead
depends
on the rule I write. For example, the regex in the above rule looks ahead 3
chars
ahead to find a match ..

snip

Because it is not called zero lookahead, it is called zero-width
positive lookahead assertion, that is it consumes zero characters from
the string while at the same time causing the match to fail if the
assertion does not match.





Re: zero width lookahead match

2007-05-30 Thread Rob Dixon

Sharan Basappa wrote:


Hi All,

I have some background working with scanners built from Flex. And I have
used lookahead capability of flex many a times. But I dont understand the
meaning of ZERO in zero lookahead match rule i.e. (?=pattern)

For example, to capture overlapping 3 digit patterns from string $str =
123456
I use the regex @store = $str =~ m/(?=(\d\d\d))/g;
So here the regex engine actually looks ahead by chars digits.


As far as lookahead expressions are concerned, Perl functions identically to
Flex. It is called zero-width lookahead because it matches a zero-width
/position/ in the string instead of a sequence of characters. If I write

'123456' =~ /\d\d\d(...)/

then '456' will be captured as the first three characters were consumed by the
preceding pattern. However if I write

'123456' =~ /(?=\d\d\d)(...)/

then '123' will be captured instead because the lookahead pattern has zero 
width.


The other question I have is - how does regex engine decide that it has to
move further its scanner by 1 character everytime since I get output 123 
234

345 456
when I run this script ?


The engine moves as far through your target string as it needs to to find a new
match. If I write

'1B3D5F' =~ /(?=(.\d.))/g;

then the engine will find a match at only every second character, and if I use
a much simpler zero-width match, just

'ABCDEF' =~ //g

then the regex will match seven times - at the beginning and end and between
every pair of characters - so the more complex zero-width match you have written
will match at all of those places, as long as there are three digits
following.
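
The two claims above can be checked with a short sketch that records where each zero-width match lands. One assumption to flag: the empty pattern is written as /(?:)/ here rather than //, because a literal empty pattern makes Perl reuse the last successfully matched regex, which would change the result in a script that has already matched something.

```perl
use strict;
use warnings;

# Record the position and capture of each lookahead match.
my $target = '1B3D5F';
my @hits;
while ($target =~ /(?=(.\d.))/g) {
    push @hits, pos($target) . "=$1";
}
print "@hits\n";          # 1=B3D 3=D5F

# Count the places where an empty (zero-width) pattern matches.
my $empty_count = () = 'ABCDEF' =~ /(?:)/g;
print "$empty_count\n";   # 7
```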

HTH,

Rob






Re: if (<FH>) VS while (<FH>)

2007-05-30 Thread jeevs
On May 29, 7:39 pm, [EMAIL PROTECTED] (Jeevs) wrote:
 Yeah, I tested it and it works manually (that's the reason I used the
 word automatically in my previous post).
 I was wondering why that doesn't work.
 Thanks, Tom, for the reply.



Thanks Paul
That really helped..
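
For readers who missed the earlier part of this thread, the difference in the subject line can be sketched as follows. The sketch assumes the question was why $_ is set automatically in a while loop but not in an if test; the in-memory file and the @seen array are made up for illustration.

```perl
use strict;
use warnings;

my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @seen;

# 'if' only tests the value that was read; the line is not stored in $_,
# so it has to be assigned by hand.
if (defined(my $line = <$fh>)) {
    chomp $line;
    push @seen, "if:$line";
}

# 'while' with a bare readline is special-cased by Perl to mean
# while (defined($_ = <$fh>)), so $_ is set automatically.
while (<$fh>) {
    chomp;
    push @seen, "while:$_";
}
close $fh;

print "$_\n" for @seen;
```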







Re: zero width lookahead match

2007-05-30 Thread Sharan Basappa

Thanks Rob and Chas ..





Re: zero width lookahead match

2007-05-30 Thread Paul Lalli
On May 30, 10:02 am, [EMAIL PROTECTED] (Chas Owens) wrote:
 On 5/30/07, Sharan Basappa [EMAIL PROTECTED] wrote:

  You mention that if I write a rule like @store = $str =~ m/((?=\d\d\d))/g;
  then the scanner does not move ahead. But as I mentioned in my mail,
  the result of this regex is 123 234 etc. This clearly shows that after every
  match,
  the regex engine of perl is moving its pointer to next char in the string
  (i.e. it starts
  looking at 23456 once 123 is matched)
  This was exactly my question.

 Because it always moves ahead by either one character or the match,
 but zero-width constructs do not consume any characters.  That is why
 they are called zero-width.

I got confused by this too.  I think Sharan's question comes down to
why isn't this an infinite loop?  That is, why does pos() move ahead
one character when it matches 0 characters?  This is not limited to
look-ahead assertions.  The behavior can be seen in other constructs
as well.  For example:

$ perl -wle'
$string = "abc";
while ($string =~ /(.*?)/g) {
  print pos($string), ": ", $1;
}
'
0:
1: a
1:
2: b
2:
3: c
3:

It appears that Perl is actually dividing the string up into
characters and slots between characters, and allowing pos() to move
to each of them in sequence.  So at the beginning, it's at the slot
before the first character, and it can successfully match 0
characters.  Then pos() moves to the first character, and the fewest
characters it can find is that one character, so $1 gets 'a'.  Then it
moves to the slot between 'a' and 'b'.  Etc.

Here's another, that doesn't allow any characters to be matched:
$ perl -wle'
$string = "abc";
while ($string =~ /(.{0})/g) {
  print pos($string), ": ", $1;
}
'
0:
1:
2:
3:

Would the above be an accurate description of what's happening?  And
if so, is this behavior documented anywhere?  I couldn't find it in a
cursory examination of either perlop or perlre...

Thanks,
Paul Lalli






Re: Outlook CSV Parser

2007-05-30 Thread Chas Owens

On 5/30/07, Laxminarayan G Kamath A [EMAIL PROTECTED] wrote:
snip

Any ways of optimising it further?

snip

Premature optimization is the root of all evil.  Have you profiled the
code yet?  If not then here is some documentation that will point you
in the right direction

http://www.perl.com/pub/a/2004/06/25/profiling.html
http://search.cpan.org/~nwclark/perl-5.8.8/utils/dprofpp.PL

But while I am looking lets see what is going on.

snip

1. One line need not be one record. Records may contain multi-line
fields.
2. A sigh of relief, but: only multi-line fields are wrapped in
double quotes.
3. Commas appear both inside and outside the fields. The ones inside
the fields must not be treated as separators - again, fields with
commas are wrapped in double quotes.

snip

The following code seems to speed up the parsing by two orders of
magnitude (2.214 seconds for the old code and 0.036 seconds for this
code on 100 records).  Also, there seems to be a bug in your original
code.  I set up a test file with 100 records of 30 fields each and it
found

found 33 fields in 1 records
found 34 fields in 1 records
found 36 fields in 3 records
found 37 fields in 5 records
found 38 fields in 10 records
found 39 fields in 9 records
found 40 fields in 12 records
found 41 fields in 17 records
found 42 fields in 15 records
found 43 fields in 13 records
found 44 fields in 7 records
found 45 fields in 5 records
found 46 fields in 1 records
found 48 fields in 1 records

===code to generate test file===
#!/usr/bin/perl

use strict;
use warnings;

my $fields    = 30;
my $fieldlen  = 30;
my @fieldtype = qw(normal quoted comma);
my $records   = shift;

for my $rec (1 .. $records) {
   for my $field (1 .. $fields) {
      my $type = $fieldtype[int rand @fieldtype];
      if ($type eq 'normal') {
         print 'n' x $fieldlen, ",";
      } elsif ($type eq 'quoted') {
         print '"';
         my $i = 0;
         until ($i > $fieldlen) {
            my $len = int rand $fieldlen;
            print 'q' x $len, "\n";
            $i += $len;
         }
         print '",';
      } elsif ($type eq 'comma') {
         print '"';
         my $i = 0;
         until ($i == $fieldlen) {
            my $len = int rand $fieldlen;
            $len = $fieldlen - $i if $i+$len > $fieldlen;
            print 'c' x ($len/2), ',', 'c' x ($len/2), "\n";
            $i += $len;
         }
         print '",';
      }
   }
   print "\n";
}

===code to parse test file===
#!/usr/bin/perl

use strict;
use warnings;

my $record = "";
my $quotes = 0;
my @records;
while (defined (my $line = <>)) {
   next if $record eq "" and $line =~ /^\s*$/;

   $record .= $line;

   #count the number of quotes
   $quotes += () = $line =~ /"/g;

   #if $quotes is even then we have a full record
   if ($quotes % 2 == 0) {
      $quotes = 0;
      chomp $record;
      my @fields;
      my $unbalanced = 0;
      for my $field (split /,/, $record) {
         my $count = $field =~ s/"//g;
         if ($count % 2) {
            if ($unbalanced) {
               $unbalanced = 0;
               $fields[-1] .= ",$field";
               next;
            }
            $unbalanced = 1;
            push @fields, $field;
            next;
         }
         if ($unbalanced) {
            $fields[-1] .= ",$field";
         } else {
            push @fields, $field;
         }
      }
      push @records, { whole => $record, fields => \@fields };
      $record = "";
   }

}

for my $rec (@records) {
   print join "|", @{$rec->{fields}},"\n===\n";
}





Re: Outlook CSV Parser

2007-05-30 Thread Chas Owens

On 5/30/07, Ken Foskey [EMAIL PROTECTED] wrote:
snip

CSV is a horrible format.  Far too unreliable,  we have exported CSV
from excel that imported differently into excel.

snip

Just a pedantic nitpick, but CSV is an incredibly reliable format; the
problem is finding programs that actually use CSV rather than a CSV-like
format.  It works out to the same thing, but it isn't CSV's fault.
For an example of a programmer using a CSV-like format where he/she
should be using the real thing, look at my other post on this thread.
My code fails to handle escaped double quotes correctly.
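
For comparison, a character-by-character parser that does handle escaped double quotes could look roughly like the sketch below. It is not the code from the other post: parse_csv is a hypothetical helper, it assumes quoting in which a literal quote inside a quoted field is written as two double quotes, and it assumes plain \n line endings, so Outlook exports with \r\n would need the \r stripped first.

```perl
use strict;
use warnings;

# Hypothetical helper: split CSV text into records (array refs of fields).
sub parse_csv {
    my ($text) = @_;
    my (@records, @fields);
    my $field     = '';
    my $in_quotes = 0;
    my @chars     = split //, $text;

    for (my $i = 0; $i < @chars; $i++) {
        my $c = $chars[$i];
        if ($in_quotes) {
            if ($c eq '"') {
                if ($i + 1 < @chars && $chars[$i + 1] eq '"') {
                    $field .= '"';      # doubled quote is a literal quote
                    $i++;
                } else {
                    $in_quotes = 0;     # closing quote
                }
            } else {
                $field .= $c;           # commas and newlines are data here
            }
        }
        elsif ($c eq '"') { $in_quotes = 1 }
        elsif ($c eq ',') { push @fields, $field; $field = '' }
        elsif ($c eq "\n") {
            push @fields, $field;       # newline outside quotes ends the record
            push @records, [@fields];
            @fields = ();
            $field  = '';
        }
        else { $field .= $c }
    }

    # Flush a final record that lacks a trailing newline.
    if (length $field or @fields) {
        push @fields, $field;
        push @records, [@fields];
    }
    return @records;
}

my @records = parse_csv(qq{name,note\n"Doe, Jane","line one\nline two"\n});
for my $rec (@records) {
    print scalar(@$rec), " fields: ", join('|', @$rec), "\n";
}
```

Tracking the quote state per character, rather than counting quotes per line, is what lets commas, newlines and doubled quotes inside a quoted field pass through as data.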





Re: zero width lookahead match

2007-05-30 Thread Chas Owens

On 30 May 2007 08:53:54 -0700, Paul Lalli [EMAIL PROTECTED] wrote:
snip

I got confused by this too.  I think Sharan's question comes down to
why isn't this an infinite loop?  That is, why does pos() move ahead
one character when it matches 0 characters?  This is not limited to
look-ahead assertions.  The behavior can be seen in other constructs
as well.  For example:

$ perl -wle'
$string = "abc";
while ($string =~ /(.*?)/g) {
  print pos($string), ": ", $1;
}
'
0:
1: a
1:
2: b
2:
3: c
3:


Because /.*?/ matches nothing as well as a, b, and c.  So it matches
nothing, then a, then nothing, then b, then nothing, then c, then
nothing.



It appears that Perl is actually dividing the string up into
characters and slots between characters, and allowing pos() to move
to each of them in sequence.  So at the beginning, it's at the slot
before the first character, and it can successfully match 0
characters.  Then pos() moves to the first character, and the fewest
characters it can find is that one character, so $1 gets 'a'.  Then it
moves to the slot between 'a' and 'b'.  Etc.


Yes, otherwise \b wouldn't work very well.

perldoc perlre
   A word boundary (\b) is a spot between two characters that has a \w
   on one side of it and a \W on the other side of it (in either order),
   counting the imaginary characters off the beginning and end of the string
   as matching a \W.

snip

Here's another, that doesn't allow any characters to be matched:
$ perl -wle'
$string = "abc";
while ($string =~ /(.{0})/g) {
  print pos($string), ": ", $1;
}
'
0:
1:
2:
3:

Would the above be an accurate description of what's happening?  And
if so, is this behavior documented anywhere?  I couldn't find it in a
cursory examination of either perlop or perlre...

snip

You are matching the nothing between the characters.





Re: encode UTF8 - MIME

2007-05-30 Thread cc96ai
On May 30, 3:51 am, [EMAIL PROTECTED] (Mumia W.)
wrote:
 On 05/29/2007 07:00 PM, cc96ai wrote:

  I got UTF8 value %C3%A9
  how could I encode it become é ?

  I try encode_base64 , but no luck
  maybe I miss some, anyone have idea ?

 You need to provide more detail about your problem.

I have a UTF-8 input:
$value = "%23%C2%A9%C2%AE%C3%98%C2%A5%C2%BC%C3%A9%C3%8B%C3%B1%C3%A0%C3%A6%3F%23";

the HTML output should be
#©®Ø¥¼éËñàæ?#;

but I cannot find a way to convert it






Re: encode UTF8 - MIME

2007-05-30 Thread Chas Owens

On 30 May 2007 06:07:55 -0700, cc96ai [EMAIL PROTECTED] wrote:
snip

I have a UTF8 input
$value = "%23%C2%A9%C2%AE%C3%98%C2%A5%C2%BC%C3%A9%C3%8B%C3%B1%C3%A0%C3%A6%3F%23";

the HTML output should be
#©®Ø¥¼éËñàæ?#;

but I cannot find a way to convert it

snip

#!/usr/bin/perl
use strict;
use warnings;
use URI::Escape;

my $s = '%C3%A9';

print uri_unescape($s), "\n";

This prints
é
for me.
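
If the goal is HTML output rather than raw bytes on a terminal, one possible follow-up is to decode the unescaped bytes as UTF-8 and emit numeric entities. The sketch below uses only core modules: the percent-decoding is done with a plain regex instead of URI::Escape, and both helper names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn percent-escapes into raw bytes, then decode
# those bytes as UTF-8 so that Perl sees characters rather than bytes.
sub percent_decode_utf8 {
    my ($escaped) = @_;
    (my $bytes = $escaped) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return decode('UTF-8', $bytes);
}

# Hypothetical helper: replace every non-ASCII character with a numeric
# HTML entity, which displays correctly whatever the page encoding is.
sub to_html_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

my $chars = percent_decode_utf8('%C3%A9');
print length($chars), "\n";              # 1: one character, not two bytes
print to_html_entities($chars), "\n";    # &#233;
```

This sketch does not escape &, < or > in the input; real HTML output would need that as well.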





Modules for parsing emails

2007-05-30 Thread Robert Hicks
Since CPAN is pretty huge, I thought I would throw out a request for 
comments.


I need to parse email headers and the body of the email. The emails are 
supposed to be plain text only but sometimes someone goofs and sends one 
in HTML. I need to strip away the HTML elements and/or convert it to 
plain text to process.


What module(s) would you suggest I look at?

Robert





Re: Modules for parsing emails

2007-05-30 Thread Tom Phoenix

On 5/30/07, Robert Hicks [EMAIL PROTECTED] wrote:


I need to parse email headers and the body of the email. The emails are
supposed to be plain text only but sometimes someone goofs and sends one
in HTML. I need to strip away the HTML elements and/or convert it to
plain text to process.


That reminds me of this:

   http://www.stonehenge.com/merlyn/UnixReview/col37.html

Hope this helps!

--Tom Phoenix
Stonehenge Perl Training





Re: Modules for parsing emails

2007-05-30 Thread Robert Hicks

Tom Phoenix wrote:

On 5/30/07, Robert Hicks [EMAIL PROTECTED] wrote:


I need to parse email headers and the body of the email. The emails are
supposed to be plain text only but sometimes someone goofs and sends one
in HTML. I need to strip away the HTML elements and/or convert it to
plain text to process.


That reminds me of this:

   http://www.stonehenge.com/merlyn/UnixReview/col37.html

Hope this helps!



Thanks Tom!

Robert
