Yeah that sounds good. Chandni
On Mon, Dec 28, 2015 at 2:46 AM, Bhupesh Chawda <[email protected]> wrote: > Thanks Chandni for the comments. > > Additionally I think, we should do away with constructing the SQL query in > the operator. This is because it is only possible in case of simple insert > statements. In case a where clause is needed in the insert statement, we > cannot construct the SQL easily. If we are accepting a parametrized query > from the user for update/merge, why not do the same for insert statements > as well? Then the hierarchy would look like: > - AbstractJdbcPOJOOutputOperator (As suggested) > -- JdbcPOJOOutputOperator (Takes care of all statements insert, update, > merge, delete) > > We don't need a separate operator for inserts and update/merge. > > Thanks. > -Bhupesh > > > On Mon, Dec 28, 2015 at 3:32 PM, Chandni Singh <[email protected]> > wrote: > > > 1. FieldInfo, is ultimately a custom object and any user who uses this > > operator has to construct an object, populate it and then use it. We > are > > trying to avoid using any custom object and allow any user to use the > > operator without writing any extra code; just configuration. > > FieldInfo is a way to provide configuration in a UI friendly way. > Providing > > configuration as JSON is not UI friendly. > > > > 2. In case of update / merge, we need what SQL data types that the > > expression provided by the user would evaluate to. In case of insert > we > > can > > go and fetch the data types of the columns directly from DB. However, > > the > > same is not possible for custom expressions; "avg(salary)" for > instance. > > Ok so here is where you can make a change. > > - JDBCPojoOutput can be renamed to AbstractJDBCPojoOutpuOperator. > > - Abstraction is to fetch the type of column. > > - Add a concrete JDBCPojoInsertOutput that derives the types of columns > > directly from DB. Please note that FieldInfo can also provide type of > > derived column. > > > > > > You mentioned "We are trying to avoid using any custom object and allow > any > > user to use the > > operator without writing any extra code". > > This I think is specific to your use case. You can create an extension of > > the above which takes JSON blob and creates FieldInfos from it. > > > > Chandni > > > > > > > > > > > > On Mon, Dec 28, 2015 at 1:48 AM, Bhupesh Chawda <[email protected] > > > > wrote: > > > > > Hi Chandni, > > > > > > Following are the issues: > > > > > > 1. FieldInfo, is ultimately a custom object and any user who uses > this > > > operator has to construct an object, populate it and then use it. We > > are > > > trying to avoid using any custom object and allow any user to use > the > > > operator without writing any extra code; just configuration. > > > 2. In case of update / merge, we need what SQL data types that the > > > expression provided by the user would evaluate to. In case of insert > > we > > > can > > > go and fetch the data types of the columns directly from DB. > However, > > > the > > > same is not possible for custom expressions; "avg(salary)" for > > instance. > > > > > > Thanks. > > > > > > -Bhupesh > > > > > > > > > On Mon, Dec 28, 2015 at 2:57 PM, Chandni Singh < > [email protected]> > > > wrote: > > > > > > > Hi Bhupesh, > > > > > > > > JDBCPojoOutputOperator was written for a demo and therefore it was > > marked > > > > Evolving which is why I had mentioned that you should feel free to > > modify > > > > it. > > > > > > > > I think an insert query can be as complex as any other query. It uses > > > > FieldInfo because in the app builder it is easy for the user to > provide > > > > that instead of JSON String. > > > > > > > > Can you please provide specifics about what it is that you find > > difficult > > > > to change/implement for providing update/merge/delete support in > > > > JDBCPojoOutputOperator? > > > > > > > > Chandni > > > > > > > > On Mon, Dec 28, 2015 at 1:09 AM, Bhupesh Chawda < > > [email protected] > > > > > > > > wrote: > > > > > > > > > Hi All, > > > > > > > > > > It has been pointed out that adding another class for handling > > update / > > > > > merge queries would not be a good option. > > > > > > > > > > Here are the current implementation details: > > > > > > > > > > - We have an existing class: JdbcPOJOOutputOperator which > accepts > > a > > > > list > > > > > of FieldInfo objects. Each element of this list indicates the > > > details > > > > > (column name, pojo field expression and datatype) of fields that > > > need > > > > > to be > > > > > inserted. Using this, the operator formulates the insert query > in > > > the > > > > > setup > > > > > method and identifies the sql datatypes of these columns from > the > > > > > database > > > > > using the table name. > > > > > > > > > > > > > > > - Now, coming to the update / merge feature, it is difficult to > > > > > formulate the update / merge query in the operator logic due to > > the > > > > > complex > > > > > structure of these statements. For this reason, we plan to take > a > > > > > parametrized SQL query from the user. This may look like: > *"update > > > > table > > > > > set x = ?, y = ? where z + w > ? and a == 1;"*. Such statements > > can > > > be > > > > > accepted from the user in addition to a json string which > > indicates > > > > the > > > > > details for the parameters: *column name, the pojo expression > and > > > the > > > > > sql data type* of the expression. Note that this information is > > > > similar > > > > > to the FieldInfo object, but is a string which can be configured > > > > easily > > > > > by > > > > > the user. Also note that the data type which is accepted is the > > SQL > > > > data > > > > > type. Using this info we can populate the parametrized query and > > run > > > > it > > > > > based on the incoming POJO. > > > > > > > > > > The second approach is able to handle all kinds of queries (insert > / > > > > update > > > > > / merge / delete). However, since we already have the > > > > > JdbcPOJOOutputOperator, we would like to merge the new > functionality > > > into > > > > > the same class. > > > > > > > > > > Here we have the following options: > > > > > > > > > > 1. Change the existing class (JdbcPOJOOutputOperator) to the > > second > > > > > approach which is more generic and also handles inserts. > > > > > 2. Add the update/ merge functionality to the existing class > > without > > > > > changing the existing functionality. This will have two > different > > > ways > > > > > that > > > > > insert queries may be handled in the operator. > > > > > 3. Add another class which extends from JdbcPOJOOutputOperator > and > > > > have > > > > > the update/merge functionality there. (This is not recommended.) > > > > > 4. Any other approach. > > > > > > > > > > Please suggest. > > > > > > > > > > Thanks. > > > > > > > > > > -Bhupesh. > > > > > > > > > > > > > > > On Mon, Dec 14, 2015 at 11:04 AM, Chandni Singh < > > > [email protected] > > > > > > > > > > wrote: > > > > > > > > > > > No I don't think we are restricting Malhar to just abstract > > classes. > > > > > > Whenever they are couple of use cases that we see quite often, we > > add > > > > > > concrete implementations. > > > > > > > > > > > > For eg. FileLineInputOperator which is a concrete implementation > of > > > > > > AbstractFileInputOperator. FSSliceReader is an example as well. > > > > > > > > > > > > In AbstractJdbcOutputOperator case there hasn't been such common > > > > > > insert/update query. > > > > > > > > > > > > Also if you look at the example I provided, it is very simple to > > > > provide > > > > > a > > > > > > concrete implementation. > > > > > > > > > > > > If you would like to change JdbcPOJOOutputOperator to work for > > > > > > "UPDATE/MERGE" then please go ahead. > > > > > > > > > > > > Chandni > > > > > > > > > > > > On Sun, Dec 13, 2015 at 8:47 PM, Bhupesh Chawda < > > > > [email protected] > > > > > > > > > > > > wrote: > > > > > > > > > > > > > I see. So, just to understand more, do we plan to keep Malhar > > > > > restricted > > > > > > to > > > > > > > the base functionality (as in abstract classes)? And put the > > > > > > configuration > > > > > > > aspect / concrete implementations in apps that use these > > operators? > > > > > > > > > > > > > > Thanks. > > > > > > > Bhupesh > > > > > > > > > > > > > > On Sat, Dec 12, 2015 at 5:43 AM, Chandni Singh < > > > > > [email protected]> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > Here is an example of doing Upsert with JDBC: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/chandnisingh/Malhar/blob/examples/apps/jdbc/src/main/java/com/datatorrent/jdbc/JdbcWriter.java > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Chandni > > > > > > > > > > > > > > > > On Fri, Dec 11, 2015 at 11:19 AM, Chandni Singh < > > > > > > [email protected] > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > The operators are under Malhar/lib/db/jdbc. > > > > > > > > > > > > > > > > > > Here is one of them: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/apache/incubator-apex-malhar/blob/devel-3/library/src/main/java/com/datatorrent/lib/db/jdbc/AbstractJdbcTransactionableOutputOperator.java > > > > > > > > > > > > > > > > > > They work with any kind PreparedStatement - insert or > update > > > > > > > > > > > > > > > > > > Chandni > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Dec 11, 2015 at 10:59 AM, Bhupesh Chawda < > > > > > > > > [email protected]> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > >> Hi Chandni, > > > > > > > > >> > > > > > > > > >> I don't see an update query being handled in the operator. > > > Could > > > > > you > > > > > > > > >> please > > > > > > > > >> point me to the appropriate class? > > > > > > > > >> Or did you mean that handling a update query is just a > > matter > > > of > > > > > > > > extending > > > > > > > > >> the class and providing a concrete implementation? > > > > > > > > >> > > > > > > > > >> Thanks. > > > > > > > > >> -Bhupesh > > > > > > > > >> > > > > > > > > >> On Fri, Dec 11, 2015 at 10:42 PM, Chandni Singh < > > > > > > > > [email protected]> > > > > > > > > >> wrote: > > > > > > > > >> > > > > > > > > >> > Hi Bhupesh, > > > > > > > > >> > > > > > > > > > >> > The current abstract JDBC Output Operators in library > are > > > > > generic > > > > > > > and > > > > > > > > >> have > > > > > > > > >> > already been used in multiple POCs and applications. In > > fact > > > > > this > > > > > > > > >> operator > > > > > > > > >> > has matured through customer use cases. It is not just > an > > > > insert > > > > > > > > >> operator. > > > > > > > > >> > We have used it to perform update and inserts. > > > > > > > > >> > > > > > > > > > >> > That said, I don't think it is a good idea to introduce > > > input > > > > > > format > > > > > > > > in > > > > > > > > >> > these abstract implementations. It is written to handle > > any > > > > > type > > > > > > of > > > > > > > > >> query, > > > > > > > > >> > be it a procedure call (that was an actual customer use > > > case). > > > > > > > > >> > > > > > > > > > >> > Chandni > > > > > > > > >> > > > > > > > > > >> > On Fri, Dec 11, 2015 at 2:50 AM, Bhupesh Chawda < > > > > > > > > >> [email protected]> > > > > > > > > >> > wrote: > > > > > > > > >> > > > > > > > > > >> > > Hi All, > > > > > > > > >> > > > > > > > > > > >> > > We are planning to proceed with the following approach > > for > > > > > JDBC > > > > > > > > >> *update* > > > > > > > > >> > > operator: > > > > > > > > >> > > > > > > > > > > >> > > - *Update Query Configuration* > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > - Example Update Query: *update tableName set a = > ?** > > > > > where b > > > > > > > = ? > > > > > > > > >> and > > > > > > > > >> > c > > > > > > > > >> > > > ?;* > > > > > > > > >> > > - Example JSON input array for parameter > > > instantiations: > > > > > > *[{a, > > > > > > > > >> > > expression, INTEGER}, {b, expression, VARCHAR}, {c, > > > > > > expression, > > > > > > > > >> > DATE}]* > > > > > > > > >> > > > > > > > > > > >> > > We are also planning to change the JDBC Output > Operator > > in > > > > > > Malhar > > > > > > > > >> Library > > > > > > > > >> > > which currently does just insert. We plan to make the > > > input > > > > > > format > > > > > > > > >> > > consistent for both insert and update and hence the > > change > > > > to > > > > > > the > > > > > > > > >> current > > > > > > > > >> > > way of configuration using JSON. Following would be > the > > > > config > > > > > > for > > > > > > > > >> > inserts: > > > > > > > > >> > > > > > > > > > > >> > > - *Insert Query Configuration* > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > - Example Insert Query: *insert into tableName > values > > > (?, > > > > > ?, > > > > > > > .. , > > > > > > > > >> ?);* > > > > > > > > >> > > - Example JSON input array for parameter > > > instantiations: > > > > > > *[{a, > > > > > > > > >> > > expression, INTEGER}, {b, expression, VARCHAR}, .. > , > > > {c, > > > > > > > > >> expression, > > > > > > > > >> > > DATE}]* > > > > > > > > >> > > > > > > > > > > >> > > Please let us know your thoughts. > > > > > > > > >> > > > > > > > > > > >> > > Thanks. > > > > > > > > >> > > > > > > > > > > >> > > -Bhupesh > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > On Wed, Dec 9, 2015 at 6:38 PM, Bhupesh Chawda < > > > > > > > > >> [email protected]> > > > > > > > > >> > > wrote: > > > > > > > > >> > > > > > > > > > > >> > > > Hi All, > > > > > > > > >> > > > > > > > > > > > >> > > > Would it be a good idea to introduce the update > > > > > functionality > > > > > > to > > > > > > > > the > > > > > > > > >> > JDBC > > > > > > > > >> > > > output operator in Apache Apex Malhar library. > > > > > > > > >> > > > > > > > > > > > >> > > > The following are possible approaches: > > > > > > > > >> > > > > > > > > > > > >> > > > 1. Accept a update query from the user with place > > > > holders > > > > > > for > > > > > > > > >> > values. > > > > > > > > >> > > > Example: *update tableName set a = ?, b = ? > where c > > > = ? > > > > > and > > > > > > > d > > > > > > > > > >> ?*. > > > > > > > > >> > > > Here "?" will be provided by the user as java > > > > expressions > > > > > > > which > > > > > > > > >> will > > > > > > > > >> > > be > > > > > > > > >> > > > evaluated from the incoming tuple. > > > > > > > > >> > > > 2. Another option is to accept in some > > configuration > > > > > format > > > > > > > > >> (json / > > > > > > > > >> > > > xml) the following and formulate the query in the > > > > > operator. > > > > > > > > This > > > > > > > > >> can > > > > > > > > >> > > become > > > > > > > > >> > > > arbitrarily complex. > > > > > > > > >> > > > 1. update clause columns > > > > > > > > >> > > > 2. update clause expressions > > > > > > > > >> > > > 3. where clause columns > > > > > > > > >> > > > 4. where clause expressions > > > > > > > > >> > > > > > > > > > > > >> > > > I am thinking about going ahead with 1. Please let > me > > > know > > > > > if > > > > > > > any > > > > > > > > >> other > > > > > > > > >> > > > option is possible and whether such a functionality > > > > already > > > > > > > exists > > > > > > > > >> in > > > > > > > > >> > > some > > > > > > > > >> > > > other class. > > > > > > > > >> > > > > > > > > > > > >> > > > Thanks. > > > > > > > > >> > > > > > > > > > > > >> > > > -Bhupesh > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
