Why do you need access to the internal bufio.Reader? If you pass a *bufio.Reader to bufio.NewReader and its buffer is at least the default size (4096 bytes), it will NOT create a new reader; it gives back your reader. So if you keep your own *bufio.Reader and pass it to csv.NewReader, then you hold the very same *bufio.Reader that csv.Reader uses internally as its r field!

On Saturday, October 31, 2020 at 18:31:34 UTC+1, Severyn Lisovsky wrote:
> Tamás Gulácsi, this was basically my initial idea, but
> unfortunately there is no access to the internal bufio.Reader. See:
> https://golang.org/src/encoding/csv/reader.go#L170
>
> peterGo, my file is ~100GB, so downloading it just for the sake of splitting
> doesn't make sense to me. I want each worker to use the
> NewRangeReader method
> <https://godoc.org/cloud.google.com/go/storage#ObjectHandle.NewRangeReader>
> to download only its own piece of the file.
>
> ren...@ix.netcom.com, a ByteCount reader that wraps the underlying reader
> wouldn't help, because csv.Reader doesn't read from the underlying reader
> synchronously; it reads from a bufio.Reader, which buffers the bytes. So, for
> example, if you read 1 row from the CSV (e.g. 10 bytes), up to 4096 bytes
> will be read from the underlying io.Reader. On the next csv.Reader.Read() call,
> no bytes will be read from the underlying io.Reader, because the next row is
> taken out of the buffer.
>
> On Saturday, October 31, 2020 at 6:02:32 PM UTC+1 Tamás Gulácsi wrote:
>
>> Give csv.NewReader your own *bufio.Reader.
>> According to the docs (https://pkg.go.dev/pkg/bufio/#NewReaderSize), if the
>> underlying io.Reader is already a *bufio.Reader with a big enough size (and
>> csv.NewReader uses the default 4k),
>> then the underlying reader is used, no new wrapping is introduced.
>>
>> This way if you use
>>     cr := &countingReader{Reader: r}
>>     br := bufio.NewReader(cr)
>>     csvR := csv.NewReader(br)
>>
>> then cr.N - br.Buffered() is the number of bytes read by csv.Reader, i.e. the
>> end of the last line read.
>>
>> Hope this helps.
>>
>> On Saturday, October 31, 2020 at 3:17:26 UTC+1, Severyn Lisovsky wrote:
>>
>>> Hi,
>>>
>>> I'm having difficulty counting the bytes processed by csv.Reader,
>>> because it reads from an internally created bufio.Reader. If I pass a
>>> counting reader to csv.NewReader, it shows not the actual number of bytes
>>> csv.Reader "processed" to produce the output I get from calling
>>> csv.Reader.Read, but the number of bytes copied into the bufio.Reader's
>>> buffer internally (some of those bytes are only consumed from the buffer
>>> during the next csv.Reader.Read call).
>>>
>>> Is there a way I can deal with this issue without forking the
>>> encoding/csv package?
>>>
>>> To give you a more high-level picture: I want to split a remote CSV file
>>> into chunks. Each chunk should be a standalone CSV file, starting at the
>>> actual beginning of a line and ending with a newline byte. So I'm trying
>>> to do the following: divide the file size by the number of chunks, and for
>>> each chunk skip the first bytes up to a newline, then read to
>>> offset+chunkSize+[number of bytes to the next newline].
>>>
>>

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/bb5c7bfa-f604-4656-b335-a2e7c6682115n%40googlegroups.com.