On Fri, Oct 1, 2010 at 7:44 AM, David Vrensk <da...@icehouse.se> wrote:
> I would just preprocess the file with Perl or Ruby:
>
> perl -ne 'next unless m#/#; s#(.*)/(.*)#\1\t\2#; print;' infile > outfile

What does the "#" represent? I have a semi-educated guess, but I
can't find that particular symbol in any examples.

Also, as far as I can tell, this regex misses the top-level paths
because they have no children. For example, the "Arts" path is dropped.
It catches "Arts/Anime" and below nicely, of course.

> Come to think of it, if your entire file is just 800k lines, I'd do the
> entire thing with Perl.

I thought about that when PHP couldn't handle it, but my Perl skills
are light and it was a chance to learn something entirely new.

Thanks for your help.

-- 
+rw
 
