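A Drill session is bound to a single connection and isolated from other 
sessions. Your 'getConnection()' method might be fetching connections from a 
pool where the session settings haven't been reset, so a later caller inherits 
whatever `store.format` the previous caller left behind. As long as connections 
are shared in that state, you will continue to have this problem.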

Before you return a connection to the pool, run the RESET command so that the 
session options are back to their defaults.

https://drill.apache.org/docs/reset/
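
As a rough illustration of the above, here is a minimal Java sketch that sets 
`store.format` for one CTAS statement and always resets it before the 
connection can go back to the pool. The class and method names 
(DrillSessionFormat, setFormatSql, createTableAs) are made up for this example, 
and the per-option RESET form is my reading of the page linked above, so please 
check it against your Drill version.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of Drill or of the original poster's code.
final class DrillSessionFormat {

    // Assumed per-option RESET syntax; see the Drill RESET documentation.
    static final String RESET_FORMAT_SQL = "ALTER SESSION RESET `store.format`";

    // Builds the ALTER SESSION statement for a given output format.
    static String setFormatSql(String format) {
        // Accept only simple names so the value cannot escape the quoted literal.
        if (!format.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("unexpected format: " + format);
        }
        return "ALTER SESSION SET `store.format` = '" + format + "'";
    }

    // Runs one CTAS statement with the given format, then restores the default
    // even if the CTAS fails, so the next user of this connection starts clean.
    static void createTableAs(Connection conn, String format, String ctasSql)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(setFormatSql(format));
            try {
                stmt.execute(ctasSql);
            } finally {
                stmt.execute(RESET_FORMAT_SQL);
            }
        }
    }
}
```

Each thread still needs its own connection for the whole call; the reset only 
protects whoever borrows the connection afterwards.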



-----Original Message-----
From: Rahul Raj [mailto:[email protected]] 
Sent: Wednesday, December 13, 2017 2:17 AM
To: [email protected]
Subject: Drill session and jdbc connections

Hi,

How is a Drill session related to a Drill JDBC connection instance? What 
happens in a pool of connections when one connection changes the store.format? 
I am seeing some mix-ups where a parquet row is written as an array of multiple 
records (rather than multiple columns) when another thread tries to create a 
csv file. This happens only when the CSV and parquet writes race each other.

Scenario:

Thread 1 for CSV creation:

Connection conn = getConnection();
conn.execute("ALTER SESSION SET `store.format`='csv'");
conn.execute("CREATE TABLE someparquet AS ...");
conn.execute("ALTER SESSION SET `store.format`='parquet'");

Thread 2 for parquet creation:

Connection conn = getConnection();
conn.execute("CREATE TABLE somecsv AS ...")

In thread 2, the parquet gets written as an ARRAY with all the fields because 
of the side effect of Thread 1 setting format as CSV when they execute in 
parallel.

Is it possible to have session isolation in this situation?

Regards,
Rahul

