[jira] [Commented] (SPARK-23715) from_utc_timestamp returns incorrect results for some UTC date/time values

2018-09-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622932#comment-16622932 ] Reynold Xin commented on SPARK-23715: - we can't fail queries in 2.x.   > from_utc_timestamp

[jira] [Comment Edited] (SPARK-10816) EventTime based sessionization

2018-09-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622807#comment-16622807 ] Reynold Xin edited comment on SPARK-10816 at 9/20/18 10:30 PM: --- I will let

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-09-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622807#comment-16622807 ] Reynold Xin commented on SPARK-10816: - I will let [~marmbrus] chime in ...  As the initial person

[jira] [Commented] (SPARK-23715) from_utc_timestamp returns incorrect results for some UTC date/time values

2018-09-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622802#comment-16622802 ] Reynold Xin commented on SPARK-23715: - Great discussions. Since you don't mind, let's revert the

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-09-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16622599#comment-16622599 ] Reynold Xin commented on SPARK-10816: - So I've actually run this idea by a lot of users, and almost

[jira] [Commented] (SPARK-23715) from_utc_timestamp returns incorrect results for some UTC date/time values

2018-09-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621405#comment-16621405 ] Reynold Xin commented on SPARK-23715: - Also I reject the notion that the old behavior was

[jira] [Commented] (SPARK-23715) from_utc_timestamp returns incorrect results for some UTC date/time values

2018-09-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621395#comment-16621395 ] Reynold Xin commented on SPARK-23715: - [~bersprockets] i think we should revert the change while we

[jira] [Resolved] (SPARK-19724) create a managed table with an existed default location should throw an exception

2018-09-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-19724. - Resolution: Fixed Assignee: Gengliang Wang Fix Version/s: 2.4.0 > create a

[jira] [Resolved] (SPARK-24626) Parallelize size calculation in Analyze Table command

2018-09-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-24626. - Resolution: Fixed Assignee: Reynold Xin Fix Version/s: 2.4.0 > Parallelize size

[jira] [Updated] (SPARK-24626) Parallelize size calculation in Analyze Table command

2018-09-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-24626: Summary: Parallelize size calculation in Analyze Table command (was: Improve Analyze Table

[jira] [Commented] (SPARK-19489) Stable serialization format for external & native code integration

2018-09-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610810#comment-16610810 ] Reynold Xin commented on SPARK-19489: - We can close this now. > Stable serialization format for

[jira] [Commented] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609892#comment-16609892 ] Reynold Xin commented on SPARK-25331: - Yes I would rely on idempotency here. Retries upon failure +

[jira] [Commented] (SPARK-23580) Interpreted mode fallback should be implemented for all expressions & projections

2018-09-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609787#comment-16609787 ] Reynold Xin commented on SPARK-23580: - 90% or 100%?   > Interpreted mode fallback should be

[jira] [Commented] (SPARK-25196) Analyze column statistics in cached query

2018-08-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589050#comment-16589050 ] Reynold Xin commented on SPARK-25196: - Can we rework the interface so the two are not separate code

[jira] [Assigned] (SPARK-25127) DataSourceV2: Remove SupportsPushDownCatalystFilters

2018-08-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-25127: --- Assignee: Reynold Xin > DataSourceV2: Remove SupportsPushDownCatalystFilters >

[jira] [Commented] (SPARK-24924) Add mapping for built-in Avro data source

2018-08-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16568454#comment-16568454 ] Reynold Xin commented on SPARK-24924: - I like the improved error message (I didn't read the earlier

[jira] [Commented] (SPARK-14220) Build and test Spark against Scala 2.12

2018-08-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16567383#comment-16567383 ] Reynold Xin commented on SPARK-14220: - This is awesome! Congrats!   > Build and test Spark against

[jira] [Assigned] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-24982: --- Assignee: Reynold Xin > UDAF resolution should not throw java.lang.AssertionError >

[jira] [Updated] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-24982: Description: See udaf.sql.out:   {code:java} -- !query 3 SELECT default.myDoubleAvg(int_col1, 3)

[jira] [Created] (SPARK-24982) UDAF resolution should not throw java.lang.AssertionError

2018-07-31 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-24982: --- Summary: UDAF resolution should not throw java.lang.AssertionError Key: SPARK-24982 URL: https://issues.apache.org/jira/browse/SPARK-24982 Project: Spark

[jira] [Created] (SPARK-24951) Table valued functions should throw AnalysisException instead of IllegalArgumentException

2018-07-27 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-24951: --- Summary: Table valued functions should throw AnalysisException instead of IllegalArgumentException Key: SPARK-24951 URL: https://issues.apache.org/jira/browse/SPARK-24951

[jira] [Commented] (SPARK-24874) Allow hybrid of both barrier tasks and regular tasks in a stage

2018-07-25 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556241#comment-16556241 ] Reynold Xin commented on SPARK-24874: - Do we really need this? Seems like an uncommon use case.  

[jira] [Updated] (SPARK-24865) Remove AnalysisBarrier

2018-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-24865: Affects Version/s: 2.3.0 > Remove AnalysisBarrier > -- > >

[jira] [Updated] (SPARK-24865) Remove AnalysisBarrier

2018-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-24865: Description: AnalysisBarrier was introduced in SPARK-20392 to improve analysis speed (don't

[jira] [Created] (SPARK-24865) Remove AnalysisBarrier

2018-07-19 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-24865: --- Summary: Remove AnalysisBarrier Key: SPARK-24865 URL: https://issues.apache.org/jira/browse/SPARK-24865 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-15689) Data source API v2

2018-07-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549724#comment-16549724 ] Reynold Xin commented on SPARK-15689: - Is there an umbrella ticket on improving the dsv2 api? If

[jira] [Comment Edited] (SPARK-23901) Data Masking Functions

2018-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545446#comment-16545446 ] Reynold Xin edited comment on SPARK-23901 at 7/16/18 4:31 PM: -- I actually

[jira] [Commented] (SPARK-23901) Data Masking Functions

2018-07-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545446#comment-16545446 ] Reynold Xin commented on SPARK-23901: - I actually feel pretty strongly we should remove them.   >

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2018-07-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540468#comment-16540468 ] Reynold Xin commented on SPARK-20202: - Yea you can try and see how difficult it is.   > Remove

[jira] [Commented] (SPARK-20202) Remove references to org.spark-project.hive

2018-07-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16540454#comment-16540454 ] Reynold Xin commented on SPARK-20202: - If you want to try and put together a PR that actually does

[jira] [Commented] (SPARK-24579) SPIP: Standardize Optimized Data Exchange between Spark and DL/AI frameworks

2018-07-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16530643#comment-16530643 ] Reynold Xin commented on SPARK-24579: - I can't either.   > SPIP: Standardize Optimized Data

[jira] [Commented] (SPARK-24642) Add a function which infers schema from a JSON column

2018-06-27 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16525548#comment-16525548 ] Reynold Xin commented on SPARK-24642: - [~maxgekk] I think this is too complicated and unpredictable.

[jira] [Commented] (SPARK-24642) Add a function which infers schema from a JSON column

2018-06-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524590#comment-16524590 ] Reynold Xin commented on SPARK-24642: - Do we want this as an aggregate function? I'm thinking it's

[jira] [Resolved] (SPARK-19480) Higher order functions in SQL

2018-06-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-19480. - Resolution: Duplicate Target Version/s: (was: 2.4.0, 3.0.0) > Higher order

[jira] [Commented] (SPARK-23901) Data Masking Functions

2018-06-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16514667#comment-16514667 ] Reynold Xin commented on SPARK-23901: - Why are we adding 1200 lines of code for some functions that

[jira] [Commented] (SPARK-24359) SPIP: ML Pipelines in R

2018-06-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500639#comment-16500639 ] Reynold Xin commented on SPARK-24359: - Why would a separate repo lead to faster iteration? What's

[jira] [Comment Edited] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498493#comment-16498493 ] Reynold Xin edited comment on SPARK-24374 at 6/1/18 8:05 PM: - Just thought

[jira] [Commented] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498493#comment-16498493 ] Reynold Xin commented on SPARK-24374: - Just thought of this — Continuous Processing really requires

[jira] [Comment Edited] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498493#comment-16498493 ] Reynold Xin edited comment on SPARK-24374 at 6/1/18 8:02 PM: - Just thought

[jira] [Comment Edited] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498488#comment-16498488 ] Reynold Xin edited comment on SPARK-24374 at 6/1/18 7:58 PM: - That breaks

[jira] [Comment Edited] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498488#comment-16498488 ] Reynold Xin edited comment on SPARK-24374 at 6/1/18 7:57 PM: - That breaks

[jira] [Comment Edited] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498488#comment-16498488 ] Reynold Xin edited comment on SPARK-24374 at 6/1/18 7:57 PM: - That breaks

[jira] [Commented] (SPARK-24374) SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16498488#comment-16498488 ] Reynold Xin commented on SPARK-24374: - That breaks end to end FT right?   > SPIP: Support Barrier

[jira] [Commented] (SPARK-24442) Add configuration parameter to adjust the numbers of records and the characters per row before truncation when a user runs.show()

2018-05-31 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497087#comment-16497087 ] Reynold Xin commented on SPARK-24442: - Actually a pretty good idea. I've often wished there's a way

[jira] [Commented] (SPARK-23074) Dataframe-ified zipwithindex

2018-05-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16460116#comment-16460116 ] Reynold Xin commented on SPARK-23074: - It should give you the same ordering - that's the

[jira] [Commented] (SPARK-23074) Dataframe-ified zipwithindex

2018-05-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16459979#comment-16459979 ] Reynold Xin commented on SPARK-23074: - Can your problem be solved by monotonically_increasing_id,

[jira] [Commented] (SPARK-24010) Select from table needs read access on DB folder when storage based auth is enabled

2018-04-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16442802#comment-16442802 ] Reynold Xin commented on SPARK-24010: - It's better to throw an error message to tell the user the db

[jira] [Commented] (SPARK-23964) why does Spillable wait for 32 elements?

2018-04-11 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434460#comment-16434460 ] Reynold Xin commented on SPARK-23964: - Was it trying to reduce overhead?   > why does Spillable

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423479#comment-16423479 ] Reynold Xin commented on SPARK-23852: - Does turning the flag parquet.filter.stats.enabled off also

[jira] [Commented] (SPARK-21363) Prevent column name duplication in temporary view

2018-04-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423048#comment-16423048 ] Reynold Xin commented on SPARK-21363: - How can user drop the fields or rename them after joins?   >

[jira] [Commented] (SPARK-23772) Provide an option to ignore column of all null values or empty map/array during JSON schema inference

2018-03-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412007#comment-16412007 ] Reynold Xin commented on SPARK-23772: - This is a good change to do!   > Provide an option to ignore

[jira] [Commented] (SPARK-23325) DataSourceV2 readers should always produce InternalRow.

2018-03-06 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389135#comment-16389135 ] Reynold Xin commented on SPARK-23325: - Yes perhaps we should do that. It is a lot more work than what

[jira] [Commented] (SPARK-20090) Add StructType.fieldNames to Python API

2018-02-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356536#comment-16356536 ] Reynold Xin commented on SPARK-20090: - Do you mind doing it? Thanks. > Add StructType.fieldNames

[jira] [Commented] (SPARK-21658) Adds the default None for value in na.replace in PySpark to match

2018-02-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351260#comment-16351260 ] Reynold Xin commented on SPARK-21658: - I'd revert this one first. I'd even consider the other one a

[jira] [Commented] (SPARK-23081) Add colRegex API to PySpark

2018-02-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351186#comment-16351186 ] Reynold Xin commented on SPARK-23081: - Scala and Python actually. Sorry I was only commenting on

[jira] [Commented] (SPARK-20425) Support an extended display mode to print a column data per line

2018-02-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350876#comment-16350876 ] Reynold Xin commented on SPARK-20425: - Hey so I don't think we should be doing multiple boolean

[jira] [Commented] (SPARK-20090) Add StructType.fieldNames to Python API

2018-02-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350861#comment-16350861 ] Reynold Xin commented on SPARK-20090: - Why would we deprecate this? I'd probably add names to Scala

[jira] [Commented] (SPARK-23081) Add colRegex API to PySpark

2018-02-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350852#comment-16350852 ] Reynold Xin commented on SPARK-23081: - Sorry why are we adding things like this? I see the value of

[jira] [Commented] (SPARK-21658) Adds the default None for value in na.replace in PySpark to match

2018-02-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350848#comment-16350848 ] Reynold Xin commented on SPARK-21658: - Sorry but I object to this change. Why would we put null as

[jira] [Commented] (SPARK-23173) from_json can produce nulls for fields which are marked as non-nullable

2018-01-25 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339794#comment-16339794 ] Reynold Xin commented on SPARK-23173: - Yea I agree with you Herman. On Sun, Jan 21, 2018 at 5:44 PM

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2018-01-16 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327453#comment-16327453 ] Reynold Xin commented on SPARK-21274: - Can't we rewrite this as two aggregates and a join?   >

[jira] [Commented] (SPARK-23083) Adding Kubernetes as an option to https://spark.apache.org/

2018-01-15 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326619#comment-16326619 ] Reynold Xin commented on SPARK-23083: - Here's the website repo:

[jira] [Commented] (SPARK-22853) K8s or Kubernetes?

2018-01-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16319147#comment-16319147 ] Reynold Xin commented on SPARK-22853: - Thanks for looking into this. > K8s or Kubernetes? >

[jira] [Resolved] (SPARK-22853) K8s or Kubernetes?

2018-01-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-22853. - Resolution: Fixed Assignee: Anirudh Ramanathan > K8s or Kubernetes? > -- >

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-01-05 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16313419#comment-16313419 ] Reynold Xin commented on SPARK-22947: - Yes you convinced me. This seems too annoying to express in

[jira] [Commented] (SPARK-7721) Generate test coverage report from Python

2018-01-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16312398#comment-16312398 ] Reynold Xin commented on SPARK-7721: I think it's fine even if you don't preserve the history forever

[jira] [Commented] (SPARK-7721) Generate test coverage report from Python

2018-01-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311731#comment-16311731 ] Reynold Xin commented on SPARK-7721: We can add it first but in my experience this will only be used

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-01-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311673#comment-16311673 ] Reynold Xin commented on SPARK-22947: - Basically we should separate the logical plan from the

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-01-04 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16311667#comment-16311667 ] Reynold Xin commented on SPARK-22947: - So this is just a hint that there is only one matching tuple

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-01-03 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310388#comment-16310388 ] Reynold Xin commented on SPARK-22947: - Li, Why are these not just normal inner joins with conditions

[jira] [Resolved] (SPARK-22648) Documentation for Kubernetes Scheduler Backend

2017-12-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-22648. - Resolution: Fixed Assignee: Anirudh Ramanathan Fix Version/s: 2.3.0 >

[jira] [Commented] (SPARK-22853) K8s or Kubernetes?

2017-12-21 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300363#comment-16300363 ] Reynold Xin commented on SPARK-22853: - I think most of your questions can be answered by Google

[jira] [Commented] (SPARK-22853) K8s or Kubernetes?

2017-12-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299632#comment-16299632 ] Reynold Xin commented on SPARK-22853: - I personally prefer "k8s", since it is way simpler, but I

[jira] [Commented] (SPARK-22825) Incorrect results of Casting Array to String

2017-12-18 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16296305#comment-16296305 ] Reynold Xin commented on SPARK-22825: - [~maropu] you should :) > Incorrect results of Casting Array

[jira] [Commented] (SPARK-7721) Generate test coverage report from Python

2017-12-13 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290134#comment-16290134 ] Reynold Xin commented on SPARK-7721: We definitely don't need to do it in one-go, but with all the

[jira] [Created] (SPARK-22779) ConfigEntry's default value should actually be a value

2017-12-13 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-22779: --- Summary: ConfigEntry's default value should actually be a value Key: SPARK-22779 URL: https://issues.apache.org/jira/browse/SPARK-22779 Project: Spark Issue

[jira] [Created] (SPARK-22710) ConfigBuilder.fallbackConf doesn't trigger onCreate function

2017-12-05 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-22710: --- Summary: ConfigBuilder.fallbackConf doesn't trigger onCreate function Key: SPARK-22710 URL: https://issues.apache.org/jira/browse/SPARK-22710 Project: Spark

[jira] [Commented] (SPARK-7721) Generate test coverage report from Python

2017-11-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266358#comment-16266358 ] Reynold Xin commented on SPARK-7721: This is really cool. I took a look but it looks like doctests are

[jira] [Commented] (SPARK-21866) SPIP: Image support in Spark

2017-11-22 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263589#comment-16263589 ] Reynold Xin commented on SPARK-21866: - Why not just declare an image function that loads the image

[jira] [Resolved] (SPARK-22369) PySpark: Document methods of spark.catalog interface

2017-11-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-22369. - Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 2.3.0 > PySpark:

[jira] [Resolved] (SPARK-22408) RelationalGroupedDataset's distinct pivot value calculation launches unnecessary stages

2017-11-02 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-22408. - Resolution: Fixed Assignee: Patrick Woody Fix Version/s: 2.3.0 >

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-11-01 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16234428#comment-16234428 ] Reynold Xin commented on SPARK-20928: - Maybe we can add some information metadata (like a string to

[jira] [Commented] (SPARK-15689) Data source API v2

2017-10-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224122#comment-16224122 ] Reynold Xin commented on SPARK-15689: - Why not put all of them as subtasks here? Also

[jira] [Commented] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-10-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220956#comment-16220956 ] Reynold Xin commented on SPARK-20928: - That doesn't yet exist does it? How would that work for

[jira] [Commented] (SPARK-21043) Add unionByName API to Dataset

2017-10-24 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217415#comment-16217415 ] Reynold Xin commented on SPARK-21043: - Because some people expect union by position too. > Add

[jira] [Updated] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-10-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20928: Attachment: Continuous Processing in Structured Streaming Design Sketch.pdf > SPIP: Continuous

[jira] [Updated] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-10-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20928: Attachment: (was: Continuous Processing in Structured Streaming Design Sketch.pdf) > SPIP:

[jira] [Updated] (SPARK-20928) SPIP: Continuous Processing Mode for Structured Streaming

2017-10-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20928: Summary: SPIP: Continuous Processing Mode for Structured Streaming (was: Continuous Processing

[jira] [Updated] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20928: Attachment: Continuous Processing in Structured Streaming Design Sketch.pdf > Continuous

[jira] [Deleted] (SPARK-22325) SPARK_TESTING env variable breaking non-spark builds on amplab jenkins

2017-10-20 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin deleted SPARK-22325: > SPARK_TESTING env variable breaking non-spark builds on amplab jenkins >

[jira] [Comment Edited] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202924#comment-16202924 ] Reynold Xin edited comment on SPARK-20928 at 10/13/17 1:40 AM: --- OK got it -

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202924#comment-16202924 ] Reynold Xin commented on SPARK-20928: - OK got it - you are basically saying if we can send the offset

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202906#comment-16202906 ] Reynold Xin commented on SPARK-20928: - Isn't there an issue with the overhead of tracking in the

[jira] [Commented] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16202621#comment-16202621 ] Reynold Xin commented on SPARK-20928: - [~c...@koeninger.org] can you write down your thoughts on how

[jira] [Updated] (SPARK-20928) Continuous Processing Mode for Structured Streaming

2017-10-12 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20928: Labels: SPIP (was: ) > Continuous Processing Mode for Structured Streaming >

[jira] [Commented] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198301#comment-16198301 ] Reynold Xin commented on SPARK-22231: - For drop columns - why not just df.drop("items.b")? >

[jira] [Commented] (SPARK-22231) Support of map, filter, withColumn, dropColumn in nested list of structures

2017-10-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198306#comment-16198306 ] Reynold Xin commented on SPARK-22231: - Can you say more? I can't think of a case in which you'd want

[jira] [Assigned] (SPARK-20396) groupBy().apply() with pandas udf in pyspark

2017-10-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin reassigned SPARK-20396: --- Assignee: Li Jin > groupBy().apply() with pandas udf in pyspark >

[jira] [Updated] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-10-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-21190: Issue Type: Sub-task (was: New Feature) Parent: SPARK-22216 > SPIP: Vectorized UDFs in

[jira] [Updated] (SPARK-20396) groupBy().apply() with pandas udf in pyspark

2017-10-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-20396: Issue Type: Sub-task (was: New Feature) Parent: SPARK-22216 > groupBy().apply() with
