Why do you need access to the internal bufio.Reader?

If you pass a *bufio.Reader to bufio.NewReader and its buffer is at least
the default size (4096 bytes), it will NOT create a new reader, but give
back your reader.
So if you keep your *bufio.Reader and give it to csv.NewReader, then you
will hold the very same *bufio.Reader as the csv.Reader's inner r!

Severyn Lisovsky wrote (Saturday, October 31, 2020, 18:31:34 UTC+1):

> Tamás Gulácsi, this was basically my initial idea, but unfortunately
> there is no access to the internal bufio.Reader. See:
> https://golang.org/src/encoding/csv/reader.go#L170
>
> peterGo, my file is ~100GB, so downloading it just for the sake of
> splitting doesn't make sense to me. I want each worker to use the
> NewRangeReader method
> <https://godoc.org/cloud.google.com/go/storage#ObjectHandle.NewRangeReader>
> to download only the relevant piece of the file.
>
> ren...@ix.netcom.com, a byte-counting reader that wraps the underlying
> reader wouldn't help, because csv.Reader doesn't read from the underlying
> reader synchronously; it reads from a bufio.Reader, which buffers the
> bytes. So, for example, if you read 1 row from the CSV (e.g. 10 bytes), up
> to 4096 bytes will be read from the underlying io.Reader. On the next
> csv.Reader.Read() call no bytes will be read from the underlying
> io.Reader, because the next row is taken out of the buffer.
>
> On Saturday, October 31, 2020 at 6:02:32 PM UTC+1 Tamás Gulácsi wrote:
>
>> Give csv.NewReader your own *bufio.Reader.
>> According to NewReaderSize (https://pkg.go.dev/pkg/bufio/#NewReaderSize),
>> if the underlying io.Reader is already a *bufio.Reader with a large
>> enough buffer (and csv.NewReader uses the default 4k),
>> then the underlying reader is used; no new wrapping is introduced.
>>
>> This way, if you use
>>   cr := &countingReader{Reader: r} // pointer, so that cr.N is updated
>>   br := bufio.NewReader(cr)
>>   csvR := csv.NewReader(br)
>>
>> then cr.N - br.Buffered() is the number of bytes consumed by csv.Reader,
>> i.e. the offset of the end of the last line read.
>>
>> Hope this helps.
>>
>> Severyn Lisovsky wrote (Saturday, October 31, 2020, 3:17:26 UTC+1):
>>
>>> Hi,
>>>
>>> I have difficulty counting the bytes processed by csv.Reader, because
>>> it reads from an internally created bufio.Reader. If I pass a counting
>>> reader to csv.NewReader, it shows not the number of bytes actually
>>> "processed" by csv.Reader to produce the output of a csv.Reader.Read
>>> call, but the number of bytes copied into the bufio.Reader's internal
>>> buffer (some of those bytes may only be consumed from the buffer during
>>> the next csv.Reader.Read call).
>>>
>>> Is there a way I can deal with this issue without forking the
>>> encoding/csv package?
>>>
>>> To give you a more high-level picture: I want to split a remote CSV
>>> file into chunks. Each chunk should be a standalone CSV file, starting
>>> at the actual beginning of a line and ending with a newline byte. So I'm
>>> trying to do the following: divide the file size by the number of
>>> chunks, and for each chunk skip the first bytes up to a newline symbol
>>> and read to offset+chunkSize+[number of bytes to the next newline
>>> symbol].
>>>
>>
