FYI, this is also in master now.

--
Jacques Nadeau
CTO and Co-Founder, Dremio
On Wed, Jan 20, 2016 at 4:14 AM, Boris Chmiel <[email protected]> wrote:

Great! Thanks, Jacques; I expect to try your patch soon.

On Saturday, January 16, 2016 at 2:36 AM, "[email protected]" <[email protected]> wrote:

Oh, this is brilliant. I will take a look at this and give it a try.

Let me take a moment and thank you all for this ambitious project you've undertaken. Thanks a lot! :)

Regards,
Rohit

Sent from my iPhone

On 16-Jan-2016, at 4:11 AM, Jacques Nadeau <[email protected]> wrote:

I have a fix and we should merge it into master shortly. You can see the progress here:

https://issues.apache.org/jira/browse/DRILL-4277

Note that, given the simplicity of the patch, if you are adventurous you could most likely apply it on top of the 1.4 version of Drill if you didn't want to wait until the next official release.

Thanks for your patience.

Jacques

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Thu, Jan 14, 2016 at 5:07 PM, Jacques Nadeau <[email protected]> wrote:

Good catch. Reproduced now. Looking into it.

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Thu, Jan 14, 2016 at 3:19 PM, Jason Altekruse <[email protected]> wrote:

Jacques, not sure if you caught this: the stack trace mentions the broadcast sender. Did the plan for your test query include a broadcast join?

(com.fasterxml.jackson.databind.JsonMappingException) Already had POJO for id (java.lang.Integer) [com.fasterxml.jackson.annotation.ObjectIdGenerator$IdKey@3372bbe8] (through reference chain: org.apache.drill.exec.physical.config.BroadcastSender["destinations"])
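One quick way to check (a sketch, reusing the test query from your message below) is to run EXPLAIN and look for a BroadcastExchange operator in the physical plan:

    -- a BroadcastExchange in the output indicates a broadcast join
    EXPLAIN PLAN FOR
    SELECT count(*)
    FROM dfs.`/data/tpch01/line/` a
    RIGHT JOIN derby.DRILL_DERBY_TEST.PERSON b
    ON a.cvalue = b.person_id;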
On Thu, Jan 14, 2016 at 2:02 PM, Jacques Nadeau <[email protected]> wrote:

Hey Rohit,

I'm having trouble reproducing this in my environment (pointing at Derby + HDFS instead of Redshift/Postgres). Can you turn on debug logging and then run this query? You can enable the debug logging we are interested in by adding the following item to logback.xml:

<logger name="org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler"
        additivity="false">
  <level value="debug" />
  <appender-ref ref="FILE" />
</logger>

This is the query that completes successfully for me. Please confirm it is similar to your query:

SELECT count(*)
FROM dfs.`/data/tpch01/line/` a
RIGHT JOIN derby.DRILL_DERBY_TEST.PERSON b
ON a.cvalue = b.person_id

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Wed, Jan 13, 2016 at 7:36 PM, <[email protected]> wrote:

Thanks a bunch, Jacques!

Sent from my iPhone

On 14-Jan-2016, at 12:48 AM, Jacques Nadeau <[email protected]> wrote:

I think it is most likely trivial, but unfortunately I haven't had the time to look at it yet. It looks like, for some reason, we're having a failure when serializing the query to pass around between nodes. Let me try to take a look today.

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Wed, Jan 13, 2016 at 3:17 AM, Rohit Kulkarni <[email protected]> wrote:

Is this trivial, or big?

Thanks,
Rohit

On Thu, Jan 7, 2016 at 11:29 PM, Boris Chmiel <[email protected]> wrote:

Hi everyone, I also get this error when trying to join an MSSQL data source with dfs parquet files. The stack is the same JsonMappingException that Rohit posted in full below ("Already had POJO for id (java.lang.Integer) [com.fasterxml.jackson.annotation.ObjectIdGenerator$IdKey@3372bbe8]", through reference chain org.apache.drill.exec.physical.config.BroadcastSender["destinations"], caused by java.lang.IllegalStateException), in my case failing in Fragment 2:0 [Error Id: 8431453e-94cb-459a-bc6c-5b5508c7ff84 on PC-PC:31010].

On Thursday, January 7, 2016 at 5:08 PM, Rohit Kulkarni <[email protected]> wrote:

Hi Jacques,

Here is the full stack trace as you asked:

Error: SYSTEM ERROR: IllegalStateException: Already had POJO for id (java.lang.Integer) [com.fasterxml.jackson.annotation.ObjectIdGenerator$IdKey@3372bbe8]

Fragment 2:0

[Error Id: 57494209-04e8-4580-860d-461cf50b41f8 on ip-x-x-x-x.ec2.internal:31010]

  (com.fasterxml.jackson.databind.JsonMappingException) Already had POJO for id (java.lang.Integer) [com.fasterxml.jackson.annotation.ObjectIdGenerator$IdKey@3372bbe8] (through reference chain: org.apache.drill.exec.physical.config.BroadcastSender["destinations"])
    com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath():210
    com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath():177
    com.fasterxml.jackson.databind.deser.BeanDeserializerBase.wrapAndThrow():1420
    com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased():351
    com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault():1056
    com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject():264
    com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithObjectId():1028
    com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther():154
    com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize():126
    com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId():113
    com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject():84
    com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType():132
    com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize():41
    com.fasterxml.jackson.databind.ObjectReader._bindAndClose():1269
    com.fasterxml.jackson.databind.ObjectReader.readValue():896
    org.apache.drill.exec.planner.PhysicalPlanReader.readFragmentOperator():94
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1145
    java.util.concurrent.ThreadPoolExecutor$Worker.run():615
    java.lang.Thread.run():745
  Caused By (java.lang.IllegalStateException) Already had POJO for id (java.lang.Integer) [com.fasterxml.jackson.annotation.ObjectIdGenerator$IdKey@3372bbe8]
    com.fasterxml.jackson.annotation.SimpleObjectIdResolver.bindItem():20
    com.fasterxml.jackson.databind.deser.impl.ReadableObjectId.bindItem():66
    com.fasterxml.jackson.databind.deser.impl.PropertyValueBuffer.handleIdValue():117
    com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build():169
    com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased():349
    com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault():1056
    com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject():264
    com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithObjectId():1028
    com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther():154
    com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize():126
    com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId():113
    com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject():84
    com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType():132
    com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize():41
    com.fasterxml.jackson.databind.ObjectReader._bindAndClose():1269
    com.fasterxml.jackson.databind.ObjectReader.readValue():896
    org.apache.drill.exec.planner.PhysicalPlanReader.readFragmentOperator():94
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1145
    java.util.concurrent.ThreadPoolExecutor$Worker.run():615
    java.lang.Thread.run():745 (state=,code=0)

On Wed, Jan 6, 2016 at 9:44 PM, Jacques Nadeau <[email protected]> wrote:

Can you turn on verbose errors and post the full stack trace of the error? You can enable verbose errors per the instructions here:

https://drill.apache.org/docs/troubleshooting/#enable-verbose-errors
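From sqlline, that boils down to setting the session option described on that page, something like:

    -- turn on verbose error output for the current session
    ALTER SESSION SET `exec.errors.verbose` = true;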
--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Wed, Jan 6, 2016 at 6:10 AM, <[email protected]> wrote:

Any thoughts on this? I tried so many variants of this query, but got the same error!

Thanks,
Rohit

On 06-Jan-2016, at 12:26 AM, Rohit Kulkarni <[email protected]> wrote:

Thanks a bunch for replying! I quickly ran this; the TAGS_US data in HDFS is in parquet format.

select distinct typeof(cvalue)
from hdfs.drill.TAGS_US;

+----------+
|  EXPR$0  |
+----------+
| VARCHAR  |
+----------+

Same with the table in Redshift. I changed my query to specifically cast the columns to VARCHAR again:

select count(*)
from redshift.reports.public.us_tags as a
join hdfs.drill.TAGS_US as b
on cast(b.cvalue as varchar) = cast(a.tag_value as varchar);

I see the same error again.

Here is the explain plan for the query:

select count(*)
from hdfs.drill.TAGS_US as a
join redshift.reports.public.us_tags as b
on a.cvalue = b.tag_value;

Error: SYSTEM ERROR: IllegalStateException: Already had POJO for id (java.lang.Integer) [com.fasterxml.jackson.annotation.ObjectIdGenerator$IdKey@3372bbe8]

+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-02        UnionExchange
01-01          StreamAgg(group=[{}], EXPR$0=[COUNT()])
01-02            Project($f0=[0])
01-03              HashJoin(condition=[=($0, $1)], joinType=[inner])
01-05                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=hdfs://ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8020/drill/TAGS_US]], selectionRoot=hdfs://ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8020/drill/TAGS_US, numFiles=1, usedMetadataFile=false, columns=[`cvalue`]]])
01-04                BroadcastExchange
02-01                  Project(tag_value=[$2])
02-02                    Jdbc(sql=[SELECT * FROM "reports"."public"."us_tags"])
 | {
  "head" : {
    "version" : 1,
    "generator" : {
      "type" : "ExplainHandler",
      "info" : ""
    },
    "type" : "APACHE_DRILL_PHYSICAL",
    "options" : [ ],
    "queue" : 0,
    "resultMode" : "EXEC"
  },
  "graph" : [ {
    "pop" : "jdbc-scan",
    "@id" : 0,
    "sql" : "SELECT *\nFROM \"reports\".\"public\".\"us_tags\"",
    "config" : {
      "type" : "jdbc",
      "driver" : "com.amazon.redshift.jdbc4.Driver",
      "url" : "",
      "username" : "",
      "password" : "",
      "enabled" : true
    },
    "userName" : "",
    "cost" : 0.0
  }, {
    "pop" : "project",
    "@id" : 131073,
    "exprs" : [ {
      "ref" : "`tag_value`",
      "expr" : "`tag_value`"
    } ],
    "child" : 0,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 100.0
  }, {
    "pop" : "broadcast-exchange",
    "@id" : 65540,
    "child" : 131073,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 100.0
  }, {
    "pop" : "parquet-scan",
    "@id" : 65541,
    "userName" : "XXXX",
    "entries" : [ {
      "path" : "hdfs://ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8020/drill/TAGS_US"
    } ],
    "storage" : {
      "type" : "file",
      "enabled" : true,
      "connection" : "hdfs://ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8020/",
      "workspaces" : {
        "root" : { "location" : "/", "writable" : true, "defaultInputFormat" : null },
        "tmp" : { "location" : "/tmp", "writable" : true, "defaultInputFormat" : null },
        "drill" : { "location" : "/drill", "writable" : true, "defaultInputFormat" : "tsv" },
        "drill2" : { "location" : "/drill", "writable" : true, "defaultInputFormat" : "csv" }
      },
      "formats" : {
        "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" },
        "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
        "tsv" : { "type" : "text", "extensions" : [ "tsv" ], "delimiter" : "\t" },
        "parquet" : { "type" : "parquet" },
        "json" : { "type" : "json" },
        "avro" : { "type" : "avro" },
        "sequencefile" : { "type" : "sequencefile", "extensions" : [ "seq" ] },
        "csvh" : { "type" : "text", "extensions" : [ "csvh" ], "extractHeader" : true, "delimiter" : "," }
      }
    },
    "format" : { "type" : "parquet" },
    "columns" : [ "`cvalue`" ],
    "selectionRoot" : "hdfs://ec2-XX-XX-XX-XX.compute-1.amazonaws.com:8020/drill/TAGS_US",
    "fileSet" : [ "/drill/TAGS_US/0_0_1.parquet", "/drill/TAGS_US/0_0_0.parquet" ],
    "cost" : 4.1667342E7
  }, {
    "pop" : "hash-join",
    "@id" : 65539,
    "left" : 65541,
    "right" : 65540,
    "conditions" : [ {
      "relationship" : "EQUALS",
      "left" : "`cvalue`",
      "right" : "`tag_value`"
    } ],
    "joinType" : "INNER",
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 4.1667342E7
  }, {
    "pop" : "project",
    "@id" : 65538,
    "exprs" : [ {
      "ref" : "`$f0`",
      "expr" : "0"
    } ],
    "child" : 65539,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 4.1667342E7
  }, {
    "pop" : "streaming-aggregate",
    "@id" : 65537,
    "child" : 65538,
    "keys" : [ ],
    "exprs" : [ {
      "ref" : "`EXPR$0`",
      "expr" : "count(1) "
    } ],
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 1.0
  }, {
    "pop" : "union-exchange",
    "@id" : 2,
    "child" : 65537,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 1.0
  }, {
    "pop" : "streaming-aggregate",
    "@id" : 1,
    "child" : 2,
    "keys" : [ ],
    "exprs" : [ {
      "ref" : "`EXPR$0`",
      "expr" : "$sum0(`EXPR$0`) "
    } ],
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 1.0
  }, {
    "pop" : "screen",
    "@id" : 0,
    "child" : 1,
    "initialAllocation" : 1000000,
    "maxAllocation" : 10000000000,
    "cost" : 1.0
  } ]
} |
+------+------+

On Mon, Jan 4, 2016 at 9:42 PM, Andries Engelbrecht <[email protected]> wrote:

Perhaps check the data type of all the fields being used for the join:

Select cvalue, TYPEOF(cvalue) from hdfs...... limit 10

and similar for tag_value on Redshift.

You can then use a predicate to find records where the data type may be different:

where typeof(<field>) not like '<data type of field>'

I believe there was a nice write-up on the topic, but I can't find it now.
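For instance, something along these lines should surface any mismatched rows (a sketch; 'VARCHAR' here is just the type you expect the column to be, so substitute accordingly):

    -- show rows whose runtime type differs from the expected VARCHAR
    select cvalue, typeof(cvalue)
    from hdfs.drill.TAGS_US
    where typeof(cvalue) not like 'VARCHAR'
    limit 10;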
--Andries

On Jan 3, 2016, at 8:45 PM, Rohit Kulkarni <[email protected]> wrote:

Hello all,

I am sure some of you, if not all, must have seen this error at some point:

Error: SYSTEM ERROR: IllegalStateException: Already had POJO for id (java.lang.Integer) [com.fasterxml.jackson.annotation.ObjectIdGenerator$IdKey@3372bbe8]

I am trying to do a join between Redshift (JDBC) and HDFS like this:

select count(*)
from hdfs.drill.TAGS_US as a
right join redshift.reports.public.us_tags as b
on a.cvalue = b.tag_value;

I don't see anything wrong in the query. The two individual tables return proper data when queried separately. Is something missing, or am I doing something wrong?

Would very much appreciate your help! Thanks!!

--
Warm Regards,
Rohit Kulkarni
Mo.: +91 89394 63593
