I did post my patch just now, which converts ColumnType to an interface: https://reviews.apache.org/r/20028/
Would be interesting if someone can check it out and give it some thought. Not a trivial change for backwards compatibility I think, but might be the best from an API standpoint. 2014-04-01 20:15 GMT+02:00 Kasper Sørensen <[email protected]>: > Hi Ankit, > > I think it is fair to assume that a CLOB will always be convertible to a > String - if memory on the client permits at least. Normally a java.sql.Clob > provides a getCharacterStream() method which returns a Reader. By setting > the system property 'metamodel.jdbc.convert.lobs' to "true" we offer the > service of eagerly reading that Reader to the end and returning a String. > The good thing about that is that the client avoids issues of the DataSet > is closed or even sometimes just moving the cursor to the next record may > invalidate the Reader. In other words: With the LOB conversion we offer a > way to make the MetaModel Row objects useable even though the connection to > the statement/resultset/connection is closed. It's of course optional. > > The same idea is applied to BLOBs. We offer to automatically open the > java.sql.Blob InputStream and load it into a byte-array. > > In my opinion the fact that something is a CLOB or a VARCHAR is a > different type of information than whether it's values are represented as > java.sql.Clob or String objects. The information on the column type is > valuable in it's own right and I would not want us to "make believe" that a > column which is *really *a CLOB is shown as a VARCHAR. > > Now that we're on the topic, we might even also consider if using the JDBC > data type names is even appropriate as a default behaviour. I mean, we're > acting in the Java world and a type like VARCHAR or TINYINT etc. is only > used in SQL, not in native Java apps. Other types of datastores have > different data types and I guess many of the datastores only have simple > data types like String, Integer, Date etc. We could choose to go for a > non-enum solution for the ColumnType and thereby also provide more > flexibility. In the CSV module we could thus provide just a single > ColumnType, e.g.: > > public static final ColumnType CSV_COLUMN_TYPE = new > ColumnTypeImpl("String", String.class); > > Read the example constructor arguments as this: Name would be "String" (or > maybe "Text" is more appropriate for "text files"?) and expected Java class > is String.class > > In the JDBC module we could have other column types available, e.g.: > > public static final ColumnType VARCHAR = new ColumnTypeImpl("VARCHAR", > String.class); > public static final ColumnType CLOB = new ColumnTypeImpl("CLOB", > java.sql.Clob.class); > public static final ColumnType CLOB_CONVERTED = new > ColumnTypeImpl("CLOB", String.class); > public static final ColumnType BLOB = new ColumnTypeImpl("BLOB", > java.sql.Blob.class); > public static final ColumnType BLOB_CONVERTED = new > ColumnTypeImpl("BLOB", byte[].class); > > (and there you then have my suggested solution for solving the Clob issue > - to use a String for the name and a Class for the expected Java type) > > I can try to make a patch for this solution. That would sort of validate > if it will work or not. But first ... what do you guys think? > > Only pitfall I see is that it breaks backwards compatibility. I think we > can retain most of the compile-time compatibility, but not runtime > compatibility (as in "drop in a new jar without recompile"). Also, old > objects serialized with the old enums would not be properly deserialized > into using these constants. Maybe that can be solved in > the LegacyDeserializationObjectInputStream [1] somehow. > > Best regards, > Kasper > > > [1] Described here: > http://wiki.apache.org/metamodel/MigratingFromEobjectsMetaModel > > > 2014-04-01 10:28 GMT+02:00 Ankit Kumar <[email protected]>: > > Hi Kasper, >> >> Thanks for initiating this discussion as we are hit by this in MM right >> now. >> >> Thinking just a bit from a client of MM perspective, I guess the CLOB >> could >> contain a Object/File/Image/???/Large String text, so while getting CLOB's >> from the real database we still can check the content and if it smells >> more >> like a String then we return the VARCHAR otherwise an Object. May be this >> is not the neatest solution but I guess for a client this provides the >> flexibility needed. >> >> An additional point worth mentioning with respect to the above remark, >> playing with different databases we see some databases allow large text to >> be stored in VARCHAR kind of fields whereas some are still having limits >> of >> 4000 characters like Oracle. So when using MM as the abstraction layer >> above the databases this feature might be quite handy for a client. >> >> Excuse me if I sound a bit too biased here. >> >> Regards >> Ankit >> >> >> On Tue, Mar 25, 2014 at 3:54 PM, Kasper Sørensen < >> [email protected]> wrote: >> >> > Hi all, >> > >> > I just came across a potential issue in MetaModel's design and want to >> > share it and maybe start thinking of ways to fix or work around it... >> > >> > In our ColumnType enum we have the method getJavaEquivalentClass() >> which is >> > used to tell which java type to expect when querying a particular >> column. >> > For instance, of I query a VARCHAR column, I can expect a >> java.lang.String >> > value. >> > >> > Now the trouble is that in our JDBC module we have a system property >> which >> > allows for eager loading of BLOBs and CLOBs so that they are >> automatically >> > read into byte-arrays and Strings respectively. This is a great utility >> > because otherwise the user has to do a lot of tedious work with >> > inputstreams etc which in most cases isn't particularly useful - in most >> > cases you just want the byte[ ] or String. >> > >> > Now the trouble is that if you turn that system property on, you get >> > Strings or byte-arrays but the column type is still CLOB/BLOB and that >> > means that the "expected equivalent Java class" is still java.sql.Clob >> or >> > java.sql.Blob! If you build your code to expect that, then it will >> > eventually break because you get a String or a byte-array instead. >> > >> > How to make this consistent? I can think of a few ways, but none that I >> > really love: >> > >> > 1) Probably the easiest way to do it is to let the JDBC datacontext give >> > the columns other ColumnTypes. But that sorta doesn't feel good now that >> > the "real" datatype is CLOB, that we will then call it e.g. VARCHAR. >> > >> > 2) We can remove the getEquivalentJavaClass() from ColumnType and >> instead >> > make it a direct member of the Column class. This will break backwards >> > compatibility of the API. >> > >> > 3) We can make ColumnType an interface or class instead of a enum. Then >> the >> > behaviour can simply be plugged in by the JDBC DataContext. I do like >> this >> > approach the best for many reasons, but it has the downside that it also >> > breaks backwards compatibility of the API and that there will no longer >> be >> > an enumerable and finite list of ColumnTypes. Maybe we could alleviate >> that >> > problem by ALSO having an enum with the typical implementations or >> > something like that. >> > >> > Maybe there are other solutions that I didn't think of. >> > >> > Regards, >> > Kasper >> > >> > >
