SPARK-33615
Best,
Meikel
From: Mich Talebzadeh
Sent: Saturday, 4 December 2021 18:36
To: Bode, Meikel, NMA-CFD
Cc: dev ; user@spark.apache.org
Subject: Re: Conda Python Env in K8S
Hi Meikel
In the past I tried with
--py-files hdfs://$HDFS_HOST:$HDFS_PORT/minikube/codes/
these options exist and I want to understand what the issue is...
Any hints on that?
Best,
Meikel
From: Mich Talebzadeh
Sent: Friday, 3 December 2021 13:27
To: Bode, Meikel, NMA-CFD
Cc: dev ; user@spark.apache.org
Subject: Re: Conda Python Env in K8S
Build the Python packages into the Docker image.
Hello,
I am trying to run spark jobs using Spark Kubernetes Operator.
But when I try to bundle a conda Python environment using the following
resource description, the Python interpreter is only unpacked on the driver and
not on the executors.
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: Spark
Can we add Python dependencies the way we can add Maven coordinates, so that we
can run something like pip install, or download from the PyPI index?
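One route the Spark docs describe for exactly this (and which SPARK-33615 made
work on K8S in Spark 3.1+) is packing the conda environment with conda-pack and
shipping it via spark.archives, which is unpacked on the driver and on the
executors. A minimal sketch, assuming pyspark_conda_env.tar.gz was built
beforehand with conda pack; the names are placeholders:

import os
from pyspark.sql import SparkSession

# Sketch of the documented conda-pack route (Spark 3.1+). The archive is
# unpacked on the driver and the executors under the alias after the '#'.
os.environ["PYSPARK_PYTHON"] = "./environment/bin/python"
spark = (SparkSession.builder
         .config("spark.archives", "pyspark_conda_env.tar.gz#environment")
         .getOrCreate())

With the Spark operator, the same key should be settable under spec.sparkConf,
though I have not verified that against the resource description above.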
From: Mich Talebzadeh
Sent: Wednesday, 24 November 2021 18:28
Cc: user@spark.apache.org
Subject: Re: [issue] not able to add external libs to pyspark job while using
Any help on these issues would be very appreciated!
Many thanks,
Meikel Bode
From: Bode, Meikel, NMA-CFD
Sent: Wednesday, 10 November 2021 08:23
To: user ; dev
Subject: HiveThrift2 ACID Transactions?
Hi all,
We want to apply INSERT, UPDATE, and DELETE operations on tables based on
parquet or ORC files served by thrift2.
Actually it's unclear whether we can enable them, and where.
At the moment, when we execute UPDATE or DELETE operations, they are blocked.
Anyone out there who uses ACI
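For context (not from this thread): open-source Spark SQL has no Hive-style
ACID support for plain parquet/ORC tables, which would explain the blocked
statements; UPDATE and DELETE are typically added by a table format such as
Delta Lake. A hedged sketch, assuming the delta-spark package is on the
classpath; the table name and data are invented:

from pyspark.sql import SparkSession

# Enable the Delta Lake SQL extension and catalog (delta-spark required).
spark = (SparkSession.builder
         .config("spark.sql.extensions",
                 "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

spark.sql("CREATE TABLE sales (id INT, qty INT) USING delta")
spark.sql("UPDATE sales SET qty = 0 WHERE id = 1")  # supported on Delta tables
spark.sql("DELETE FROM sales WHERE id = 2")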
Hi all,
I am trying to get Thrift2 on Spark 3.1.2 running on K8S, with one executor for
the moment. Submission works so far, but it fails on the executor side during
initialization.
The issue seems to be related to access restrictions on certain directories...
but I am not sure. Please see the errors marked in yellow.
Many thanks! 😊
From: Gengliang Wang
Sent: Tuesday, 19 October 2021 16:16
To: dev ; user
Subject: [ANNOUNCE] Apache Spark 3.2.0
Hi all,
Apache Spark 3.2.0 is the third release of the 3.x line. With tremendous
contribution from the open-source community, this release managed to resolve in
ex
Hi,
thx. Great work. Will test it 😊
Best,
Meikel Bode
From: Kidong Lee
Sent: Friday, 10 September 2021 01:39
To: user@spark.apache.org
Subject: spark thrift server as hive on spark running on kubernetes, and more.
Hi,
Recently, I have open-sourced a tool called
DataRoaster (https://github.c
On EKS...
From: Mich Talebzadeh
Sent: Thursday, 12 August 2021 15:47
To: Bode, Meikel, NMA-CFD
Cc: user@spark.apache.org
Subject: Re: K8S submit client vs. cluster
Ok
As I see it, with PySpark, even if it is submitted in cluster mode, it will be
converted to client mode anyway.
Are you running
Hi Mich,
All PySpark.
Best,
Meikel
From: Mich Talebzadeh
Sent: Thursday, 12 August 2021 13:41
To: Bode, Meikel, NMA-CFD
Cc: user@spark.apache.org
Subject: Re: K8S submit client vs. cluster
Is this Spark or PySpark?
Hi all,
If we schedule a spark job on k8s, how are volume mappings handled?
In client mode I would expect that the driver's volumes have to be mapped
manually in the pod template. Executor volumes are attached dynamically based
on submit parameters. Right...?
In cluster mode I would expect that volumes
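For what it's worth, Spark on K8S documents declarative volume options for both
driver and executor pods, so in cluster mode no manual pod edit should be
needed. A minimal sketch of those conf keys, shown as builder config for
brevity (normally passed as --conf at submit time); the PVC name and mount path
are invented:

from pyspark.sql import SparkSession

# Sketch of the documented spark.kubernetes.*.volumes.* options; in cluster
# mode the driver's volumes can be declared the same way as the executors'.
drv = "spark.kubernetes.driver.volumes.persistentVolumeClaim.data"
exe = "spark.kubernetes.executor.volumes.persistentVolumeClaim.data"
spark = (SparkSession.builder
         .config(drv + ".mount.path", "/opt/data")
         .config(drv + ".options.claimName", "spark-data-pvc")
         .config(exe + ".mount.path", "/opt/data")
         .config(exe + ".options.claimName", "spark-data-pvc")
         .getOrCreate())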
Hi folks,
Maybe not the right audience, but maybe you have come across such a requirement.
Is it possible to define a parquet schema that contains technical column names
and, per column, a list of translations of that name into different languages?
To give an example:
Technical: "custnr" would transla
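One hedged possibility: Spark lets a StructField carry arbitrary metadata, and
it writes its full schema JSON (metadata included) into the parquet footer, so
the translations should survive a round trip. A sketch using the custnr example
from above; the translation labels and path are invented:

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.getOrCreate()

# Keep the technical name as the column name; carry translations as metadata.
schema = StructType([
    StructField("custnr", StringType(), True,
                metadata={"translations": {"en": "customer number",
                                           "de": "Kundennummer"}}),
])
df = spark.createDataFrame([("C-1001",)], schema)
df.write.mode("overwrite").parquet("/tmp/customers")

# Spark reads the schema (incl. metadata) back from the parquet footer.
print(spark.read.parquet("/tmp/customers").schema["custnr"].metadata)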
Hi all,
My df looks like follows:
Situation:
MainKey, SubKey, Val1, Val2, Val3, ...
1, 2, a, null, c
1, 2, null, null, c
1, 3, null, b, null
1, 3, a, null, c
Desired outcome:
1, 2, a, b, c
1, 2, a, b, c
1, 3, a, b, c
1, 3, a, b, c
How could I populate/synchronize empty cells of all records wi
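The question is cut off above, but given the desired outcome, a window
aggregate per MainKey would do it. A sketch; note it partitions by MainKey
alone, since the desired rows for SubKey 2 pick up the "b" that only exists
under SubKey 3:

from pyspark.sql import SparkSession, functions as F, Window

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame(
    [(1, 2, "a", None, "c"),
     (1, 2, None, None, "c"),
     (1, 3, None, "b", None),
     (1, 3, "a", None, "c")],
    ["MainKey", "SubKey", "Val1", "Val2", "Val3"])

# max() ignores nulls, so each ValN gets the (here unique) non-null value
# observed anywhere within the same MainKey partition.
w = Window.partitionBy("MainKey")
filled = df.select(
    "MainKey", "SubKey",
    *[F.max(c).over(w).alias(c) for c in ("Val1", "Val2", "Val3")])
filled.show()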
Hi Kidong Lee,
Thank you for your email. Actually I came across your blog and it seems to be
very complete.
As you write, it's not easy to bring Spark Thrift2 to K8S, and because you had
to write your own wrapper, I have the impression that it is not really
officially supported, despite the fact
Hi all,
We are migrating to K8S and I wonder whether there are already "good practices"
for running thrift2 on K8S?
Best,
Meikel
Hi all,
when broadcasting a large dict containing several million entries to executors,
what exactly happens when calling bc_var.value within a UDF like:
..
d = bc_var.value
..
Does d receive a copy of the dict inside value, or is this handled like a
pointer?
Thanks,
Meikel
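As far as I understand PySpark's Broadcast implementation, the answer is "like
a pointer": the dict is deserialized at most once per Python worker process, on
first access, and .value then returns a reference to that cached object, not a
per-call copy. A small sketch; the lookup data and UDF name are invented:

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

# Stand-in for the multi-million-entry dict from the question.
lookup = {i: f"name_{i}" for i in range(1_000_000)}
bc_var = spark.sparkContext.broadcast(lookup)

@F.udf(StringType())
def resolve(key):
    # First access per Python worker deserializes and caches the dict;
    # d is a reference to that cached object, not a fresh copy.
    d = bc_var.value
    return d.get(key)

spark.range(10).withColumn("name", resolve("id")).show()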
e root by
indicating:
child, lvl-0-parent
inquiry1, null
inquiry2, null
order3, null
Actually that's what I implemented with the recursive UDF I put into the
initial post.
Thank you for any hints on that issue! Any hints on the UDF solution are also
very welcome.
Thx and best,
Meikel
From: Bod
Hi all,
I implemented a recursive UDF that tries to find a document number in a long
list of predecessor documents. This can be a multi-level hierarchy:
C is the successor of B, which is the successor of A (but many more levels are
possible).
As input to that UDF I prepare a dict that contains the complete doc
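The original UDF is not shown in full here, but an iterative walk over a
broadcast predecessor map avoids Python's recursion limit and tolerates dirty
data. A hedged sketch; the map and document names are invented stand-ins:

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

# Hypothetical predecessor map: doc -> direct predecessor, None at the root.
pred_map = {"C": "B", "B": "A", "A": None}
bc_pred = spark.sparkContext.broadcast(pred_map)

@F.udf(StringType())
def find_root(doc):
    d, seen = bc_pred.value, set()
    while d.get(doc) is not None:
        if doc in seen:      # cycle guard for dirty hierarchies
            return None
        seen.add(doc)
        doc = d[doc]
    return doc

spark.createDataFrame([("C",), ("A",)], ["doc"]) \
     .withColumn("root", find_root("doc")).show()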
Congrats!
From: Hyukjin Kwon
Sent: Wednesday, 3 March 2021 02:41
To: user @spark ; dev
Subject: [ANNOUNCE] Announcing Apache Spark 3.1.1
We are excited to announce Spark 3.1.1 today.
Apache Spark 3.1.1 is the second release of the 3.x line. This release adds
Python type annotations and Pytho
Hi Sean.
You are right. So we are using Docker images for our Spark cluster. The
generation of the worker image did not succeed and therefore the old 3.0.1
image was still in use.
Thanks,
Best,
Meikel
From: Sean Owen
Sent: Friday, 26 February 2021 10:29
To: Bode, Meikel, NMA-CFD
Cc: user
Hi All,
After changing to 3.0.2, I am facing the following issue. Thanks for any hints.
Best,
Meikel
df = self.spark.read.json(path_in)
File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 300,
in json
File "/opt/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_
Hi all,
I process a lot of JSON files of different sizes. All files share the same
overall structure. I have no issues with files of around 150-300 MB.
Another file of around 530 MB now causes errors when I apply selectExpr on the
resulting DF after reading the file:
AnalysisException: canno
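The exception text is cut off, but an AnalysisException on selectExpr after
reading JSON often means a referenced column was not part of the schema
inferred for that particular file. One hedged workaround is to pin an explicit
schema at read time instead of relying on per-file inference; the paths below
are placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Infer the schema once from a known-good file, then pin it, so a file with
# sparse or missing fields cannot yield a narrower inferred schema.
schema = spark.read.json("/data/known_good_sample.json").schema
df = spark.read.schema(schema).json("/data/file_530mb.json")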