[ 
https://issues.apache.org/jira/browse/SPARK-20794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-20794.
-------------------------------
    Resolution: Invalid

It's a question, so it belongs on the mailing list. I think it's a DASHDB 
question. show() just picks rows from the first partition of the underlying 
data source, and nothing guarantees the source returns rows in the same order 
each time.
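
A minimal sketch of the point about partitions, assuming a local SparkSession 
and a small in-memory stand-in for the DASHDB table. The helper name 
firstOrderNumbers and the choice of CUST_ORDER_NUMBER as a sort key are 
illustrative assumptions, not part of the issue. It suggests that an explicit 
orderBy before show() or limit() is one way to get the same rows on every run, 
whatever order the source hands its partitions back in.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DeterministicShow {
  // Hypothetical helper (not from the issue): sort by a key before taking
  // n rows, so the result does not depend on which partition is read first.
  def firstOrderNumbers(df: DataFrame, n: Int): Seq[Int] =
    df.orderBy("CUST_ORDER_NUMBER")
      .limit(n)
      .collect()
      .map(_.getAs[Int]("CUST_ORDER_NUMBER"))
      .toSeq

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("deterministic-show-sketch")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the DASHDB table: a few values copied from the report.
    val df = Seq(
      ("Eyewear", 107861, "Rutland"),
      ("Lanterns", 189003, "Sydney"),
      ("Cooking Gear", 107863, "Sydney"),
      ("Tools", 112835, "Portsmouth"),
      ("Putters", 112839, "Jacksonville")
    ).toDF("PRODUCT_TYPE", "CUST_ORDER_NUMBER", "CITY")
      .repartition(3) // spread rows over partitions, as a parallel read might

    // Without orderBy, show(n) returns whatever the first partition(s) yield.
    df.show(3)

    // With orderBy, the rows shown are determined by the sort key alone.
    df.orderBy("CUST_ORDER_NUMBER").show(3)
    println(firstOrderNumbers(df, 3).mkString(", "))

    spark.stop()
  }
}
```

Sorting costs a shuffle, so this is a trade-off: it is reasonable for 
inspection and tests, but unnecessary when any 5 rows will do.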

> Spark show() command on dataset does not retrieve consistent rows from DASHDB 
> data source
> -----------------------------------------------------------------------------------------
>
>                 Key: SPARK-20794
>                 URL: https://issues.apache.org/jira/browse/SPARK-20794
>             Project: Spark
>          Issue Type: Question
>          Components: Spark Core
>    Affects Versions: 2.0.0
>            Reporter: Sahana HA
>            Priority: Minor
>
> When a user creates a DataFrame from a DASHDB data source (a relational 
> database) and executes df.show(5), it returns a different set of rows on 
> each execution. We are aware that show(5) picks the first 5 rows from an 
> existing partition, so the result is not guaranteed to be consistent 
> across executions. 
> However, when we run the same show(5) command against S3 storage or the 
> Bluemix object store (non-relational data sources), we always get the same 
> rows, in the same order, on each execution.
> We just wanted to confirm why DASHDB behaves differently from other data 
> sources such as S3 or the Bluemix object store. Is the issue with Spark or 
> with DASHDB alone? Or does this inconsistent-row behavior apply to all 
> relational data sources?
> Repro snippet:
> // Load the data from DASHDB
> val dashdb = sqlContext.read.format("packageName").options(dashdbreadOptions).load
> // Execution #1
> dashdb.show(5)
> +--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
> |        PRODUCT_LINE|PRODUCT_TYPE|CUST_ORDER_NUMBER|   CITY|STATE|      COUNTRY|GENDER|AGE|MARITAL_STATUS|  PROFESSION|
> +--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
> |Personal Accessories|     Eyewear|           107861|Rutland|   VT|United States|     F| 39|       Married|       Sales|
> |   Camping Equipment|    Lanterns|           189003| Sydney|  NSW|    Australia|     F| 20|        Single| Hospitality|
> |   Camping Equipment|Cooking Gear|           107863| Sydney|  NSW|    Australia|     F| 20|        Single| Hospitality|
> |Personal Accessories|     Eyewear|           189005|Villach|   NA|      Austria|     F| 37|       Married|Professional|
> |Personal Accessories|     Eyewear|           107865|Villach|   NA|      Austria|     F| 37|       Married|Professional|
> +--------------------+------------+-----------------+-------+-----+-------------+------+---+--------------+------------+
> only showing top 5 rows
> // Execution #2
> dashdb.show(5)
> +--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
> |        PRODUCT_LINE|PRODUCT_TYPE|CUST_ORDER_NUMBER|        CITY|STATE|       COUNTRY|GENDER|AGE|MARITAL_STATUS| PROFESSION|
> +--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
> |Mountaineering Eq...|       Tools|           112835|  Portsmouth|   NA|United Kingdom|     M| 24|        Single|      Other|
> |   Camping Equipment|Cooking Gear|           193902|Jacksonville|   FL| United States|     F| 22|        Single|Hospitality|
> |   Camping Equipment|       Packs|           112837|Jacksonville|   FL| United States|     F| 22|        Single|Hospitality|
> |Mountaineering Eq...|        Rope|           193904|Jacksonville|   FL| United States|     F| 31|       Married|      Other|
> |      Golf Equipment|     Putters|           112839|Jacksonville|   FL| United States|     F| 31|       Married|      Other|
> +--------------------+------------+-----------------+------------+-----+--------------+------+---+--------------+-----------+
> only showing top 5 rows



