[jira] [Updated] (SPARK-12763) Spark gets stuck executing SSB query
[ https://issues.apache.org/jira/browse/SPARK-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon updated SPARK-12763:
---------------------------------
    Labels: bulk-closed  (was: )

> Spark gets stuck executing SSB query
>
>                 Key: SPARK-12763
>                 URL: https://issues.apache.org/jira/browse/SPARK-12763
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: Standalone cluster
>            Reporter: Vadim Tkachenko
>            Priority: Major
>              Labels: bulk-closed
>         Attachments: Spark shell - Details for Stage 5 (Attempt 0).pdf
>
>
> I am trying to emulate an SSB load. The data was generated with
> https://github.com/Percona-Lab/ssb-dbgen
> at scale factor 1000 and converted to Parquet format.
> I run the following script:
>
> // Load each SSB table from Parquet and cache it
> val pLineOrder = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/lineorder").cache()
> val pDate = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/date").cache()
> val pPart = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/part").cache()
> val pSupplier = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/supplier").cache()
> val pCustomer = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/customer").cache()
>
> // Register the tables so they can be referenced from SQL
> pLineOrder.registerTempTable("lineorder")
> pDate.registerTempTable("date")
> pPart.registerTempTable("part")
> pSupplier.registerTempTable("supplier")
> pCustomer.registerTempTable("customer")
>
> The query:
>
> val sql41 = sqlContext.sql("""
>   select D_YEAR, C_NATION, sum(LO_REVENUE - LO_SUPPLYCOST) as profit
>   from date, customer, supplier, part, lineorder
>   where LO_CUSTKEY = C_CUSTKEY
>     and LO_SUPPKEY = S_SUPPKEY
>     and LO_PARTKEY = P_PARTKEY
>     and LO_ORDERDATE = D_DATEKEY
>     and C_REGION = 'AMERICA'
>     and S_REGION = 'AMERICA'
>     and (P_MFGR = 'MFGR#1' or P_MFGR = 'MFGR#2')
>   group by D_YEAR, C_NATION
>   order by D_YEAR, C_NATION
> """)
>
> Then
>
> sql41.show()
>
> gets stuck: at some point there is no progress and the server is fully idle,
> but the job stays at the same stage.
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-12763) Spark gets stuck executing SSB query
[ https://issues.apache.org/jira/browse/SPARK-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-12763:
------------------------------
    Component/s: SQL

> Spark gets stuck executing SSB query
>
>                 Key: SPARK-12763
>                 URL: https://issues.apache.org/jira/browse/SPARK-12763
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: Standalone cluster
>            Reporter: Vadim Tkachenko
>         Attachments: Spark shell - Details for Stage 5 (Attempt 0).pdf
[jira] [Updated] (SPARK-12763) Spark gets stuck executing SSB query
[ https://issues.apache.org/jira/browse/SPARK-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vadim Tkachenko updated SPARK-12763:
------------------------------------
    Attachment: Spark shell - Details for Stage 5 (Attempt 0).pdf

Details on the stalled stage

> Spark gets stuck executing SSB query
>
>                 Key: SPARK-12763
>                 URL: https://issues.apache.org/jira/browse/SPARK-12763
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>         Environment: Standalone cluster
>            Reporter: Vadim Tkachenko
>         Attachments: Spark shell - Details for Stage 5 (Attempt 0).pdf