Hi YU-MING,

It seems ResultSet is not thread-safe in general, so we do not provide a parallel JDBC adapter. However, the Oracle implementation seems to be thread-safe, so you can implement your own parallel adapter.
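
One option, if you prefer not to rely on the Oracle ResultSet being thread-safe, is to split the query into non-overlapping partitions (for example by a key range or a ROWNUM window) and give each worker thread its own Connection/Statement/ResultSet, then combine the batches at the end. Below is a rough, untested sketch along those lines; make_connection, partition_queries and allocator are placeholders you would supply, and the conversion uses pyarrow.jvm.record_batch just like your snippet.

import concurrent.futures

import jpype
import pyarrow as pa
import pyarrow.jvm


def fetch_partition(make_connection, query, allocator):
    # Run one partitioned query on its own Connection/Statement/ResultSet
    # and return the resulting pyarrow RecordBatches.
    JdbcToArrow = jpype.JPackage("org").apache.arrow.adapter.jdbc.JdbcToArrow
    connection = make_connection()  # placeholder: one JDBC connection per worker
    stmt = connection.createStatement()
    try:
        result_set = stmt.executeQuery(query)
        arrow_vector_iterator = JdbcToArrow.sqlToArrowVectorIterator(result_set, allocator)
        # Convert each VectorSchemaRoot to a pyarrow RecordBatch on the Python side.
        return [pyarrow.jvm.record_batch(root) for root in arrow_vector_iterator]
    finally:
        stmt.close()
        connection.close()


def select_parallel(make_connection, partition_queries, allocator, max_workers=4):
    # Fetch non-overlapping partitions of the same table in parallel and
    # combine them into a single pyarrow Table (row order is not preserved).
    batches = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_partition, make_connection, q, allocator)
                   for q in partition_queries]
        for future in concurrent.futures.as_completed(futures):
            batches.extend(future.result())
    return pa.Table.from_batches(batches)

Whether this actually helps depends on where the time is spent (network, Oracle itself, or the JDBC-to-Arrow conversion), so it is worth profiling the single-threaded version first.
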
Best,
Liya Fan

On Thu, Sep 9, 2021 at 8:18 PM HSU YU-MING <[email protected]> wrote:

> Hi All:
>
> I am trying to fetch Oracle's data and transfer it to Arrow. Below is my
> code snippet. Regarding the line
> "[jvm.record_batch(arrow_vector) for arrow_vector in arrow_vector_iterator]",
> is there any way I can parallelize it to speed things up / gain more
> performance?
>
> def select_pyarrow_jvm(query):
>     start = time.time()
>
>     stmt = jdbc_sql_connection.createStatement()
>     result_set = stmt.executeQuery(query)
>
>     try:
>         arrow_vector_iterator = jpype.JPackage("org").apache.arrow.adapter.jdbc.JdbcToArrow.sqlToArrowVectorIterator(
>             result_set,
>             ra
>         )
>         record_batch_list = [jvm.record_batch(arrow_vector)
>                              for arrow_vector in arrow_vector_iterator]
>         data_arrow_tbl = pa.Table.from_batches(record_batch_list)
>         df = data_arrow_tbl.to_pandas()
>     except Exception as e:
>         logging.exception("Error inside select_pyarrow_jvm ")
>     finally:
>         # Ensure that we clear the JVM memory.
>         stmt.close()
>
>     elapse = time.time() - start
>     return df, elapse
>
> Many Thanks,
>
> Abe
