Hey,

This email summarizes a conversation that started on the
Apache Wayang Slack channel.

Context:

I'm trying to read data from **three different databases** using
`GenericJdbcTableSource`, which lets me specify a separate configuration for
each database. The example below shows the setup for the first one:

```java
Configuration configuration = new Configuration();
configuration.setProperty("wayang.pgsql1.jdbc.url",
    "jdbc:postgresql://localhost:5434/apache_wayang_test_db");
configuration.setProperty("wayang.pgsql1.jdbc.user", "postgres");
configuration.setProperty("wayang.pgsql1.jdbc.password", "password");
configuration.setProperty("wayang.pgsql1.jdbc.driverName",
    "org.postgresql.Driver");

TableSource ts1 = new GenericJdbcTableSource("pgsql1", "local_averages");
DataQuantaBuilder<?, Record> jdbcSource1 = planBuilder.readTable(ts1);
```
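To make the intended three-database setup concrete, here is a minimal sketch of how the per-database keys could be generated. Only the `wayang.<id>.jdbc.*` key pattern and the `pgsql1` values come from the snippet above; the helper class `MultiDbConfigSketch`, the ids `pgsql2` and `pgsql3`, and their ports are assumptions made up for illustration. The sketch uses plain Java maps so it runs without Wayang on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Wayang): builds the JDBC property
// keys for several databases using the wayang.<id>.jdbc.* pattern.
final class MultiDbConfigSketch {

    // Returns the four JDBC properties for a single database id.
    public static Map<String, String> jdbcProperties(
            String id, String url, String user, String password) {
        String prefix = "wayang." + id + ".jdbc.";
        Map<String, String> props = new LinkedHashMap<>();
        props.put(prefix + "url", url);
        props.put(prefix + "user", user);
        props.put(prefix + "password", password);
        props.put(prefix + "driverName", "org.postgresql.Driver");
        return props;
    }

    // Collects the properties for all three databases in one map.
    public static Map<String, String> allDatabases() {
        Map<String, String> all = new LinkedHashMap<>();
        // Values taken from the snippet above.
        all.putAll(jdbcProperties("pgsql1",
            "jdbc:postgresql://localhost:5434/apache_wayang_test_db",
            "postgres", "password"));
        // Assumed ids and ports for the second and third database.
        all.putAll(jdbcProperties("pgsql2",
            "jdbc:postgresql://localhost:5435/apache_wayang_test_db",
            "postgres", "password"));
        all.putAll(jdbcProperties("pgsql3",
            "jdbc:postgresql://localhost:5436/apache_wayang_test_db",
            "postgres", "password"));
        return all;
    }

    public static void main(String[] args) {
        // In the actual job, each entry would be applied with
        // configuration.setProperty(key, value).
        allDatabases().forEach((key, value) ->
            System.out.println(key + "=" + value));
    }
}
```

In the job itself, each id would then be passed to its own `GenericJdbcTableSource`, in the same way as `pgsql1` above.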

However, when trying to run the job, I consistently encounter a **"no plan"
error**.

If I use the **TPCH example** as a starting point, I can only access **one
Postgres DB**, but I need to access **multiple databases** with individual
configurations. Currently, I don't see a way to provide separate
configuration objects when using `PostgresTableSource`:

```java
DataQuantaBuilder<?, Record> jdbcSource = planBuilder.readTable(
    new PostgresTableSource("local_averages")
);
```

Is there a missing feature here, or should there be a way to pass a
`Configuration` object directly into the constructor of
`PostgresTableSource`?

---

### Progress and Debugging:

I’ve managed to perform a **3-table join** without using `JavaPlanBuilder`,
following the **SQLtes2 demo** approach. This is a good starting point, but
I would like to follow **Juri's recommendation** and use the methods from
`PlanBuilder` instead of manually linking operators.

For those interested, here’s a link to my current work branch:
[My branch on GitHub](https://github.com/kamir/incubator-wayang/tree/kamir-patch-03)

---

### Examples:

Here are some examples of what I have working and what fails:

1. **Working**: Three sources manually linked via operators.
   - File: `org.apache.wayang.applications.demo1.Job123C`

2. **Working**: TPCH example using a single database.
   - File: `org.apache.wayang.applications.Tpch`

3. **Failing**: `PeerGroupComparison2.java`
   - The job fails with the following error:
   ```
   Exception in thread "main" org.apache.wayang.core.api.exception.WayangException: No implementations that concatenate out@GenericJdbcTableSource...
   ```

The full stack trace and code are available in the following file:
[PeerGroupComparison2.java](https://github.com/apache/incubator-wayang/blob/3b745955c497583077e9c8ad739e2db325/src/main/java/org/apache/wayang/applications/PeerGroupComparison2.java)

---

### Pull Request:

For anyone interested in helping with this issue, here’s the link to my
pull request:

[PR #473 - Demo cluster configuration for Apache Wayang](https://github.com/apache/incubator-wayang/pull/473)

This PR is still in **draft mode**, and I’m using it for debugging
purposes, but feel free to check it out or offer suggestions.

---

### Additional Working Examples:

1. **Working**: `PeerGroupComparison.java`
   - File: [PeerGroupComparison.java](https://github.com/apache/incubator-wayang/blob/3b745955c497583077e9c8ad739e2db325/src/main/java/org/apache/wayang/applications/PeerGroupComparison.java)

2. **Working**: `Job123C.java`
   - File: [Job123C.java](https://github.com/apache/incubator-wayang/blob/3b745955c497583077e9c8ad739e2db325/src/main/java/org/apache/wayang/applications/demo1/Job123C.java)

---

Please let me know if you have any insight into how to handle **multiple
Postgres databases with separate configurations** using `PlanBuilder`, or
if there is a different approach I should consider.

Best,
Mirko

-- 
Mirko Kämpf
*PPMC Apache Wayang*
