[
https://issues.apache.org/jira/browse/SPARK-41277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-41277:
---
Summary: Leverage shuffle key as bucketing properties (was: Save and
leverage shuffle key in
[
https://issues.apache.org/jira/browse/SPARK-41277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651877#comment-17651877
]
Ohad Raviv commented on SPARK-41277:
I managed to do some quick-and-dirty solution, just to be able
[
https://issues.apache.org/jira/browse/SPARK-41277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649300#comment-17649300
]
Ohad Raviv commented on SPARK-41277:
[~gurwls223] - can I please get your opinion here?
> Save and
[
https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646973#comment-17646973
]
Ohad Raviv commented on SPARK-41510:
ok.. after diving into the code I think I found what I was
[
https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646734#comment-17646734
]
Ohad Raviv commented on SPARK-41510:
the conda solution is more for "static" packages.
the
[
https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646606#comment-17646606
]
Ohad Raviv edited comment on SPARK-41510 at 12/13/22 12:23 PM:
---
[
https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646606#comment-17646606
]
Ohad Raviv commented on SPARK-41510:
[~hvanhovell] - can you please refer that to someone?
>
[
https://issues.apache.org/jira/browse/SPARK-41510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-41510:
---
Description:
When working interactively with Spark through notebooks in various envs -
Ohad Raviv created SPARK-41510:
--
Summary: Support easy way for user defined PYTHONPATH in workers
Key: SPARK-41510
URL: https://issues.apache.org/jira/browse/SPARK-41510
Project: Spark
Issue
Ohad Raviv created SPARK-41277:
--
Summary: Save and leverage shuffle key in tblproperties
Key: SPARK-41277
URL: https://issues.apache.org/jira/browse/SPARK-41277
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-37752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17465972#comment-17465972
]
Ohad Raviv commented on SPARK-37752:
That is what I deduced. thanks for the answer!
> Python UDF
Ohad Raviv created SPARK-37752:
--
Summary: Python UDF fails when it should not get evaluated
Key: SPARK-37752
URL: https://issues.apache.org/jira/browse/SPARK-37752
Project: Spark
Issue Type:
Ohad Raviv created SPARK-34416:
--
Summary: Support avroSchemaUrl in addition to avroSchema
Key: SPARK-34416
URL: https://issues.apache.org/jira/browse/SPARK-34416
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-30739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17031437#comment-17031437
]
Ohad Raviv commented on SPARK-30739:
Closing as I realized this is actually the documented behaviour
[
https://issues.apache.org/jira/browse/SPARK-30739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv resolved SPARK-30739.
Resolution: Workaround
> unable to turn off Hadoop's trash feature
>
Ohad Raviv created SPARK-30739:
--
Summary: unable to turn off Hadoop's trash feature
Key: SPARK-30739
URL: https://issues.apache.org/jira/browse/SPARK-30739
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845559#comment-16845559
]
Ohad Raviv commented on SPARK-18748:
[~kelemen] - thanks for sharing.
> UDF multiple evaluations
[
https://issues.apache.org/jira/browse/SPARK-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv closed SPARK-16820.
--
resolved.
> Sparse - Sparse matrix multiplication
> -
>
>
[
https://issues.apache.org/jira/browse/SPARK-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845557#comment-16845557
]
Ohad Raviv commented on SPARK-16820:
this issue was resolved by SPARK-19368 and SPARK-16469.
>
[
https://issues.apache.org/jira/browse/SPARK-27707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839544#comment-16839544
]
Ohad Raviv commented on SPARK-27707:
[~cloud_fan] - any chance you can take a look?
> Performance
Ohad Raviv created SPARK-27707:
--
Summary: Performance issue using explode
Key: SPARK-27707
URL: https://issues.apache.org/jira/browse/SPARK-27707
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799163#comment-16799163
]
Ohad Raviv commented on SPARK-18748:
[~nimfadora] - thanks, we actually also ended up using this
Ohad Raviv created SPARK-26645:
--
Summary: CSV infer schema bug infers decimal(9,-1)
Key: SPARK-26645
URL: https://issues.apache.org/jira/browse/SPARK-26645
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16735492#comment-16735492
]
Ohad Raviv commented on SPARK-18748:
We're encountering this same problem once again with Spark
[
https://issues.apache.org/jira/browse/SPARK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-18748:
---
Affects Version/s: (was: 1.6.1)
2.3.0
2.4.0
> UDF
Ohad Raviv created SPARK-26070:
--
Summary: another implicit type coercion bug
Key: SPARK-26070
URL: https://issues.apache.org/jira/browse/SPARK-26070
Project: Spark
Issue Type: Bug
Ohad Raviv created SPARK-25963:
--
Summary: Optimize generate followed by window
Key: SPARK-25963
URL: https://issues.apache.org/jira/browse/SPARK-25963
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-25951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-25951:
---
Description:
we've noticed that sometimes a column rename causes extra shuffle:
{code:java}
val N =
[
https://issues.apache.org/jira/browse/SPARK-25951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-25951:
---
Description:
we've noticed that sometimes a column rename causes extra shuffle:
{code}
val N = 1 <<
Ohad Raviv created SPARK-25951:
--
Summary: Redundant shuffle if column is renamed
Key: SPARK-25951
URL: https://issues.apache.org/jira/browse/SPARK-25951
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625448#comment-16625448
]
Ohad Raviv edited comment on SPARK-23985 at 9/24/18 7:15 AM:
-
{quote}You
[
https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-23985:
---
Description:
while predicate push down works with this query:
{code:sql}
select * from (
select
[
https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625448#comment-16625448
]
Ohad Raviv commented on SPARK-23985:
{quote}You should move where("a>'1'") before withColumn:{quote}
[
https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625436#comment-16625436
]
Ohad Raviv edited comment on SPARK-23985 at 9/24/18 7:07 AM:
-
the same is
[
https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625436#comment-16625436
]
Ohad Raviv commented on SPARK-23985:
the same is true for Spark 2.4:
{code}
[
https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625422#comment-16625422
]
Ohad Raviv commented on SPARK-23985:
you're right. that's very strange. looks like something got
[
https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-24528:
---
Description:
https://issues.apache.org/jira/browse/SPARK-24528#Closely related to
SPARK-24410,
[
https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1659#comment-1659
]
Ohad Raviv commented on SPARK-24528:
After digging a little bit in the code and Jira I understand
[
https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523344#comment-16523344
]
Ohad Raviv commented on SPARK-24528:
Hi,
well it took me some time to get to it, but here are my
[
https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511578#comment-16511578
]
Ohad Raviv commented on SPARK-24528:
I think the 2nd point better suits my usecase. i'll try to look
[
https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16511483#comment-16511483
]
Ohad Raviv commented on SPARK-24528:
I understand the tradeoff, the question is how could we
[
https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509465#comment-16509465
]
Ohad Raviv commented on SPARK-24528:
[~cloud_fan], [~viirya] - Hi, I found a somewhat similar issue to
Ohad Raviv created SPARK-24528:
--
Summary: Missing optimization for Aggregations/Windowing on a
bucketed table
Key: SPARK-24528
URL: https://issues.apache.org/jira/browse/SPARK-24528
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-24410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493561#comment-16493561
]
Ohad Raviv commented on SPARK-24410:
[~sowen], [~cloud_fan] - could you please check if my
Ohad Raviv created SPARK-24410:
--
Summary: Missing optimization for Union on bucketed tables
Key: SPARK-24410
URL: https://issues.apache.org/jira/browse/SPARK-24410
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-23985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439114#comment-16439114
]
Ohad Raviv commented on SPARK-23985:
I see in the Optimizer that filters are getting pushed only if
Ohad Raviv created SPARK-23985:
--
Summary: predicate push down doesn't work with simple compound
partition spec
Key: SPARK-23985
URL: https://issues.apache.org/jira/browse/SPARK-23985
Project: Spark
Ohad Raviv created SPARK-22910:
--
Summary: Wrong results in Spark Job because failed to move to Trash
Key: SPARK-22910
URL: https://issues.apache.org/jira/browse/SPARK-22910
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16241887#comment-16241887
]
Ohad Raviv commented on SPARK-21657:
Hi,
I created a pull request:
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224420#comment-16224420
]
Ohad Raviv commented on SPARK-21657:
After some debugging, I think I understand the tricky part here.
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224374#comment-16224374
]
Ohad Raviv commented on SPARK-21657:
ok i found the relevant rule:
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224365#comment-16224365
]
Ohad Raviv commented on SPARK-21657:
After further investigating I believe that my assessment is
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223946#comment-16223946
]
Ohad Raviv commented on SPARK-21657:
Sure,
the plan for
{code:java}
val df_exploded =
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16223902#comment-16223902
]
Ohad Raviv commented on SPARK-21657:
I switched to toArray instead of toList in the above code and I
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222312#comment-16222312
]
Ohad Raviv edited comment on SPARK-21657 at 10/27/17 12:53 PM:
---
Hi,
Just
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16222312#comment-16222312
]
Ohad Raviv commented on SPARK-21657:
Hi,
Just ran a profiler for this code:
{code:scala}
val BASE =
[
https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16220450#comment-16220450
]
Ohad Raviv commented on SPARK-21657:
Hi,
Wanted to add that we're facing exactly the same issue. 6
[
https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15842581#comment-15842581
]
Ohad Raviv commented on SPARK-19368:
well, not with the same elegant code. the main problem is that
[
https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839673#comment-15839673
]
Ohad Raviv commented on SPARK-19368:
caused by..
> Very bad performance in
[
https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-19368:
---
Attachment: profiler snapshot.png
> Very bad performance in BlockMatrix.toIndexedRowMatrix()
>
[
https://issues.apache.org/jira/browse/SPARK-19368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv updated SPARK-19368:
---
Description:
In SPARK-12869, this function was optimized for the case of dense matrices
using
Ohad Raviv created SPARK-19368:
--
Summary: Very bad performance in BlockMatrix.toIndexedRowMatrix()
Key: SPARK-19368
URL: https://issues.apache.org/jira/browse/SPARK-19368
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-19230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15823644#comment-15823644
]
Ohad Raviv commented on SPARK-19230:
looks like it happens since SPARK-13827 - it just caused the
Ohad Raviv created SPARK-19230:
--
Summary: View creation in Derby gets SQLDataException because
definition gets very big
Key: SPARK-19230
URL: https://issues.apache.org/jira/browse/SPARK-19230
Project:
[
https://issues.apache.org/jira/browse/SPARK-18861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750729#comment-15750729
]
Ohad Raviv commented on SPARK-18861:
I think it was a problem at v2.0.0.
it is better to resolve it
Ohad Raviv created SPARK-18861:
--
Summary: Spark-SQL unconsistent behavior with "struct" expressions
Key: SPARK-18861
URL: https://issues.apache.org/jira/browse/SPARK-18861
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-17662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748264#comment-15748264
]
Ohad Raviv commented on SPARK-17662:
When I tried to use your suggestion I have encountered some
[
https://issues.apache.org/jira/browse/SPARK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731203#comment-15731203
]
Ohad Raviv commented on SPARK-18748:
accidentally. I already closed the other ticket as duplicate
>
[
https://issues.apache.org/jira/browse/SPARK-18747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ohad Raviv closed SPARK-18747.
--
Resolution: Duplicate
> UDF multiple evaluations causes very poor performance
>
Ohad Raviv created SPARK-18748:
--
Summary: UDF multiple evaluations causes very poor performance
Key: SPARK-18748
URL: https://issues.apache.org/jira/browse/SPARK-18748
Project: Spark
Issue
Ohad Raviv created SPARK-18747:
--
Summary: UDF multiple evaluations causes very poor performance
Key: SPARK-18747
URL: https://issues.apache.org/jira/browse/SPARK-18747
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-17662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15672928#comment-15672928
]
Ohad Raviv commented on SPARK-17662:
you're right,
great solution!
I didn't know about the
Ohad Raviv created SPARK-17662:
--
Summary: Dedup UDAF
Key: SPARK-17662
URL: https://issues.apache.org/jira/browse/SPARK-17662
Project: Spark
Issue Type: New Feature
Reporter: Ohad
[
https://issues.apache.org/jira/browse/SPARK-16976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413461#comment-15413461
]
Ohad Raviv commented on SPARK-16976:
well,
it's not for MLlib but for GraphX and seems very much in
Ohad Raviv created SPARK-16976:
--
Summary: KCore implementation
Key: SPARK-16976
URL: https://issues.apache.org/jira/browse/SPARK-16976
Project: Spark
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/SPARK-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401039#comment-15401039
]
Ohad Raviv commented on SPARK-16820:
I will create a PR soon with a suggested fix, but tell me what
Ohad Raviv created SPARK-16821:
--
Summary: GraphX MCL algorithm
Key: SPARK-16821
URL: https://issues.apache.org/jira/browse/SPARK-16821
Project: Spark
Issue Type: New Feature
Ohad Raviv created SPARK-16820:
--
Summary: Sparse - Sparse matrix multiplication
Key: SPARK-16820
URL: https://issues.apache.org/jira/browse/SPARK-16820
Project: Spark
Issue Type: New Feature
Ohad Raviv created SPARK-16469:
--
Summary: Long running Driver task while multiplying big matrices
Key: SPARK-16469
URL: https://issues.apache.org/jira/browse/SPARK-16469
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199811#comment-15199811
]
Ohad Raviv commented on SPARK-13313:
Hi,
I am trying to use graphx's SCC and was very concerned with
80 matches