ostinru opened a new issue, #74:
URL: https://github.com/apache/cloudberry-pxf/issues/74

   I am going to test that cloudberry-pxf works as well as greenplum-pxf with 
JDBC data sources (and update the docs).
   
   Currently, Automation covers only PostgreSQL JDBC (because the PostgreSQL JDBC 
driver is included in the pxf build and we can use Cloudberry as a PostgreSQL 
instance). However, most issues we observe in production happen with other 
databases (usually Oracle and ClickHouse). Oracle uses unusual data types that 
don't match the PostgreSQL ones, and ClickHouse used to have issues with the 
huge volumes of data that users want to export.
   
   Here I see a couple of issues:
   1. Oracle licensing is not clear to me. I am not sure that we can run 
containerised Oracle in CI / on dev machines. I have seen that TestContainers 
provides[1] Oracle as one of the options, and it is stated that this container 
is used across different open-source projects[2]. Here I need guidance and best 
practices from @tuhaihe  / Apache.
   2. MS SQL Server also requires accepting a EULA before running a container 
with a database [3]. Is it OK?
   
   And is it possible to download these Docker images from "special network 
environments" (#63)?
   
   ### Test Design ideas
   
   **Dependencies**:
   I am not sure that we want to provide "~~batteries~~ drivers included" with 
`cloudberry-pxf`, or even as `cloudberry-pxf-driveres`[4]. It will be an 
obligation to support different databases for ages (as we are doing with 
HBase). However we can keep these drivers as test dependencies for `pxf-jdbc`.
   
   **TestContainers**:
   I think that we can start the Cloudberry + PXF container 
(`ci/docker/pxf-cbdb-dev`) directly from Java code in TestContainers, on a 
network shared with the 3rd party databases. I know that this will run SLOW 
(slow Cloudberry compilation, PXF build, Hadoop start), but it seems to be a 
step in the right direction.
   
   ```
   +----------------------------------------------------------------+
   |                              Host                              |
   |  +-------------------------------+  +-----------------------+  |
   |  |           Docker              |  |        Docker         |  |
   |  |   [Cloudberry] --> [PXF] ------------> [Database]        |  |
   |  |                               |  |                       |  |
   |  +-------------------------------+  +-----------------------+  |
   |                                                                |
   |                         [Automation]                           |
   +----------------------------------------------------------------+
   ```
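
   A minimal sketch of one piece of this wiring, under stated assumptions: 
once the containers share a network, PXF would reach each database by its 
network alias, so the test harness could render a `jdbc-site.xml` per 
database and place it into the PXF server configuration. The class name, the 
aliases (`oracle-db`, `clickhouse-db`) and the credentials are hypothetical. 
The ports and the `FREEPDB1` service name are the usual defaults of the 
respective images and should be verified against the images actually used. 
The `jdbc.*` property names follow the PXF JDBC server configuration as 
documented for greenplum-pxf.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: render a PXF jdbc-site.xml for a database reachable by its Docker network alias. */
final class PxfJdbcSiteSketch {

    /** Oracle Free; assumes the default listener port 1521 and the FREEPDB1 service. */
    static Map<String, String> oracle(String alias, String user, String password) {
        return props("oracle.jdbc.OracleDriver",
                "jdbc:oracle:thin:@//" + alias + ":1521/FREEPDB1", user, password);
    }

    /** ClickHouse; assumes the default HTTP port 8123 and the "default" database. */
    static Map<String, String> clickhouse(String alias, String user, String password) {
        return props("com.clickhouse.jdbc.ClickHouseDriver",
                "jdbc:clickhouse://" + alias + ":8123/default", user, password);
    }

    private static Map<String, String> props(String driver, String url,
                                             String user, String password) {
        // LinkedHashMap keeps the properties in a stable, readable order.
        Map<String, String> p = new LinkedHashMap<>();
        p.put("jdbc.driver", driver);
        p.put("jdbc.url", url);
        p.put("jdbc.user", user);
        p.put("jdbc.password", password);
        return p;
    }

    /** Renders the properties in the Hadoop-style XML layout used by *-site.xml files. */
    static String render(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<configuration>\n");
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("    <property>\n")
              .append("        <name>").append(escape(e.getKey())).append("</name>\n")
              .append("        <value>").append(escape(e.getValue())).append("</value>\n")
              .append("    </property>\n");
        }
        return sb.append("</configuration>\n").toString();
    }

    /** Escapes the characters that would otherwise break XML element content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Hypothetical alias and credentials, for illustration only.
        System.out.println(render(oracle("oracle-db", "test", "test")));
    }
}
```

   In a TestContainers-based setup, the alias passed here would be the same 
one assigned to the database container on the shared network (for example via 
`withNetworkAliases`), and the rendered file would be copied into the PXF 
container before PXF starts.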
   
   @MisterRaindrop , any thoughts on this?
   
   [1] https://java.testcontainers.org/modules/databases/oraclefree/
   [2] 
https://github.com/gvenzl/oci-oracle-free?tab=readme-ov-file#users-of-these-images
   [3] https://java.testcontainers.org/modules/databases/mssqlserver/
   [4] https://github.com/open-gpdb/cloudberry-pxf/pull/6


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

