luky777 commented on issue #9844:
URL: https://github.com/apache/hudi/issues/9844#issuecomment-1770200400
Below is a simplified version of the code that only runs the Hudi procedure,
along with my results. With the code below I get the error: "It's not a Hudi
table". If I change the line:
spark_df_commits = spark.sql("call show_commits(adpdb.hudi_stage_jira, 5)")
to:
spark_df_commits = spark.sql("call show_commits('adpdb.hudi_stage_jira', 5)")
then I get a different error: ParseException:
== SQL ==
call show_commits('adpdb.hudi_stage_jira', 5)
^^^
------------------------------------------------------------------
```
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql.session import SparkSession
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.sql.functions import concat_ws, col, split, size, lit
from pyspark.sql.types import StringType, BooleanType, DateType
import pyspark.sql.functions as F
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
# Build the session with the Hudi catalog and SQL extension configured.
spark = (
    SparkSession.builder
    .config('spark.serializer', 'org.apache.spark.serializer.KryoSerializer')
    .config('spark.sql.hive.convertMetastoreParquet', 'false')
    .config('spark.sql.catalog.spark_catalog', 'org.apache.spark.sql.hudi.catalog.HoodieCatalog')
    .config('spark.sql.extensions', 'org.apache.spark.sql.hudi.HoodieSparkSessionExtension')
    .config('spark.sql.legacy.pathOptionBehavior.enabled', 'true')
    .getOrCreate()
)
sc = spark.sparkContext
glueContext = GlueContext(sc)
job = Job(glueContext)
job.init(args["JOB_NAME"], args)
# Execute the query and show the results
spark_df_commits = spark.sql("call show_commits(adpdb.hudi_stage_jira, 5)")
spark_df_commits.show()
job.commit()
```
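For comparison, the Hudi SQL procedure documentation shows `show_commits` being called with named arguments (`table => ...`, `limit => ...`) rather than positional ones. The sketch below only builds that statement as a string so the exact form is easy to see. `build_show_commits_call` is a hypothetical helper, not part of Hudi or Glue, and it is an assumption that a database-qualified table name is accepted here; I have not confirmed that this form avoids either error in a Glue job.

```python
def build_show_commits_call(table: str, limit: int) -> str:
    """Hypothetical helper: format a show_commits call using named arguments."""
    # Reject embedded quotes so the table name cannot break out of the SQL literal.
    if "'" in table:
        raise ValueError("table name must not contain single quotes")
    return f"call show_commits(table => '{table}', limit => {int(limit)})"


# Table name and limit are the ones used in the job above.
stmt = build_show_commits_call("adpdb.hudi_stage_jira", 5)
print(stmt)
```

In the job, the statement would be passed to `spark.sql(stmt)` in place of the hand-written string. If the ParseException persists with this form as well, that may indicate that the Hudi SQL extension is not active in the session that `getOrCreate()` returns, which is worth checking separately.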