Re: [R] Regex matching that gives byte offset?
Hmmm ... that should do it, thanks. But how would one use this on a file without reading it into memory completely?

Joh

On Wednesday 28 October 2009 16:29:00 Prof Brian Ripley wrote:
> Do you mean like regexpr() (on the same help page)? Depending on your locale, you might actually prefer the character offset: if you want to match in a MBCS and have byte offsets you will need to work a bit harder if useBytes=TRUE is not sufficient for you.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regex matching that gives byte offset?
On Mon, 2 Nov 2009, Johannes Graumann wrote:
> Hmmm ... that should do it, thanks. But how would one use this on a file without reading it into memory completely?

?file, ?readLines, ?readBin will tell you about connections.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
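
A minimal sketch of what the help pages named above describe: reading a file through an open connection in fixed-size pieces rather than all at once. The demo file, the variable names and the tiny chunk size are assumptions made for illustration, not something from the thread.

```r
# Write a small demo file so the sketch is self-contained (hypothetical content).
path <- tempfile(fileext = ".xml")
out <- file(path, open = "wb")
writeLines(c("<root>", "<entry id=\"1\"/>", "<entry id=\"2\"/>", "</root>"),
           out, sep = "\n")
close(out)

# Open in binary mode so no newline or encoding translation takes place.
con <- file(path, open = "rb")
chunk_size <- 16L     # deliberately tiny for the demo; far larger in practice
total_bytes <- 0      # double, so that very large files do not overflow an integer

repeat {
  chunk <- readBin(con, what = "raw", n = chunk_size)
  if (length(chunk) == 0L) break        # nothing left to read
  # Only this chunk is held in memory at this point.
  total_bytes <- total_bytes + length(chunk)
}
close(con)

print(total_bytes)
```

Because the connection stays open between calls, each readBin() continues where the previous one stopped, so memory use is bounded by the chunk size rather than by the file size.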
Re: [R] Regex matching that gives byte offset?
On Monday 02 November 2009 13:41:45 Prof Brian Ripley wrote:
> On Mon, 2 Nov 2009, Johannes Graumann wrote:
> > Hmmm ... that should do it, thanks. But how would one use this on a file without reading it into memory completely?
>
> ?file, ?readLines, ?readBin will tell you about connections.

... all of which I only get to read by the line and a regexpr on that will not give me the absolute offset. grep -buo on the unix command line is really fast for this. If I can't find the native R equivalent, I'm of a mind to do this via a sys call - ugly and not portable, but SOOO fast ... is it possible in R?

Joh
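
One possible way to get absolute offsets while still reading by the line is to keep a running count of the bytes consumed so far and add the per-line byte position from regexpr(). The sketch below is an illustration only: the function name byte_offsets, its arguments and the demo file are hypothetical, it assumes "\n" line endings (one byte per line break), and it reports only the first match on each line (gregexpr() would be needed for all matches).

```r
# Demo file with a two-byte UTF-8 character, written byte-for-byte.
path <- tempfile(fileext = ".xml")
out <- file(path, open = "wb")
writeLines(c("<root>",
             "  <entry id=\"1\">caf\u00e9</entry>",
             "  <entry id=\"2\"/>",
             "</root>"),
           out, sep = "\n", useBytes = TRUE)
close(out)

# Hypothetical helper: 0-based byte offsets of the first match on each line.
byte_offsets <- function(path, pattern, lines_per_chunk = 10000L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  base <- 0              # bytes consumed before the current chunk
  hits <- numeric(0)
  repeat {
    lines <- readLines(con, n = lines_per_chunk, warn = FALSE)
    if (length(lines) == 0L) break
    len <- nchar(lines, type = "bytes") + 1          # +1 for the "\n" (assumed)
    starts <- base + cumsum(c(0, len[-length(len)])) # byte offset of each line start
    pos <- regexpr(pattern, lines, fixed = TRUE, useBytes = TRUE)
    found <- pos > 0
    hits <- c(hits, starts[found] + pos[found] - 1)  # convert to 0-based
    base <- base + sum(len)
  }
  hits
}

offs <- byte_offsets(path, "<entry", lines_per_chunk = 2L)
print(offs)

# Check the first offset by seeking to it and reading the bytes back.
con <- file(path, open = "rb")
seek(con, where = offs[1])
first_hit <- rawToChar(readBin(con, what = "raw", n = 6L))
close(con)
print(first_hit)
```

The offsets are 0-based, which is how grep -b counts them. Files with "\r\n" line endings would need two bytes per line break instead of one.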
[R] Regex matching that gives byte offset?
Hi,

Is there any way of doing 'grep' or something like it on the content of a text file and extracting the byte position of the match in the file? I'm facing the need to access rather largish (600MB) XML files and would like to be able to index them ...

Thanks for any help or flogging,

Joh
Re: [R] Regex matching that gives byte offset?
Do you mean like regexpr() (on the same help page)? Depending on your locale, you might actually prefer the character offset: if you want to match in a MBCS and have byte offsets you will need to work a bit harder if useBytes=TRUE is not sufficient for you.

On Wed, 28 Oct 2009, Johannes Graumann wrote:
> Hi, Is there any way of doing 'grep' or something like it on the content of a text file and extract the byte positioning of the match in the file? I'm facing the need to access rather largish (600MB) XML files and would like to be able to index them ...
>
> Thanks for any help or flogging, Joh
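
A small illustration of the character versus byte offset distinction made above, assuming a UTF-8 encoded string. The sample string and the variable names are made up for the example.

```r
# Hypothetical sample: the accented "e" occupies two bytes in UTF-8.
x <- enc2utf8("caf\u00e9 <entry id=\"1\">")

# Character offset (1-based) of the first match.
char_pos <- as.integer(regexpr("<entry", x, fixed = TRUE))

# Byte offset (1-based) of the same match: useBytes = TRUE matches byte-wise.
byte_pos <- as.integer(regexpr("<entry", x, fixed = TRUE, useBytes = TRUE))

# The byte offset is at least as large as the character offset, because
# every multibyte character before the match adds extra bytes.
print(c(char = char_pos, byte = byte_pos))
```

Both positions are 1-based and relative to the start of the string, not to the start of a file.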