Hey Eugene, 

Thanks for this; I didn’t realize this was a parameter I could tune. It fixed my 
problems straight away. 

Chet

> On Nov 29, 2017, at 2:14 PM, Eugene Kirpichov <[email protected]> wrote:
> 
> Hi,
> I think you're hitting something that can be fixed by configuring Redshift 
> driver:
> http://docs.aws.amazon.com/redshift/latest/dg/queries-troubleshooting.html#set-the-JDBC-fetch-size-parameter
> By default, the JDBC driver collects all the results for a query at one time. 
> As a result, when you attempt to retrieve a large result set over a JDBC 
> connection, you might encounter a client-side out-of-memory error. To enable 
> your client to retrieve result sets in batches instead of in a single 
> all-or-nothing fetch, set the JDBC fetch size parameter in your client 
> application.
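> 
> Roughly, the change looks like this with JdbcIO (just a sketch: the driver class, 
> connection URL, query, and the 10,000-row fetch size below are placeholders, and 
> withFetchSize may not be available in every Beam version; if it isn't, setting the 
> fetch size on the statement via a statement preparator is the equivalent):
> 
> // Sketch only: connection details, query, and fetch size are placeholders.
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.coders.StringUtf8Coder;
> import org.apache.beam.sdk.io.jdbc.JdbcIO;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> import org.apache.beam.sdk.values.PCollection;
> 
> public class RedshiftReadSketch {
>   public static void main(String[] args) {
>     Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
> 
>     PCollection<String> ids =
>         p.apply(
>             JdbcIO.<String>read()
>                 .withDataSourceConfiguration(
>                     JdbcIO.DataSourceConfiguration.create(
>                             "com.amazon.redshift.jdbc.Driver",
>                             "jdbc:redshift://example-cluster:5439/dev")
>                         .withUsername("user")
>                         .withPassword("password"))
>                 .withQuery("SELECT id FROM big_table")
>                 // Fetch rows in batches instead of materializing the whole
>                 // result set client-side; the Redshift/PostgreSQL driver
>                 // typically honors this only when autocommit is off.
>                 .withFetchSize(10_000)
>                 .withRowMapper(rs -> rs.getString(1))
>                 .withCoder(StringUtf8Coder.of()));
> 
>     p.run().waitUntilFinish();
>   }
> }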
> 
> On Wed, Nov 29, 2017 at 1:41 PM Chet Aldrich <[email protected]> wrote:
> Hey all, 
> 
> I’m running a Dataflow job that uses the JDBC IO transform to pull in a bunch 
> of data (20 million rows, for reference) from Redshift, and I’m noticing that I’m 
> getting an OutOfMemoryError on the Dataflow workers once I reach around 4 million 
> rows. 
> 
> It seems like, given the code that I’m reading inside JDBC IO and the guide here 
> (https://beam.apache.org/documentation/io/authoring-overview/#read-transforms), 
> that it’s just pulling the data in from the result set one by one and then 
> emitting each output. Considering that this is a limitation of the driver, that 
> makes sense, but is there a way I can get around the memory limitation somehow? 
> It also seems like Dataflow repeatedly tries to create more workers to handle 
> the work, but it can’t, which is part of the problem. 
> 
> If more info would help sort out what I could do to avoid running into the 
> memory limitations, I’m happy to provide it. 
> 
> 
> Thanks,
> 
> Chet 
