Please take a look at JavaUtils#mapAsSerializableJavaMap

FYI

On Mon, May 16, 2016 at 3:24 AM, 喜之郎 <251922...@qq.com> wrote:

>
> hi, Ted.
> I found a built-in function called str_to_map, which can transform a
> string to a map. But it cannot meet my need, because my string may be a
> map with an array nested in its values, for example
> map<string, Array<string>>. I don't think it will work in my situation.
>
> Cheers
>
> ------------------ Original Message ------------------
> *From:* "喜之郎" <251922...@qq.com>
> *Sent:* Monday, May 16, 2016, 10:00 AM
> *To:* "Ted Yu" <yuzhih...@gmail.com>
> *Cc:* "user" <user@spark.apache.org>
> *Subject:* Re: spark udf can not change a json string to a map
>
> this is my use case:
>    Another system uploads csv files to my system. The csv files contain
> complicated data types such as map. To express complicated data types,
> and ordinary strings containing special characters, we put urlencoded
> strings in the csv files. So we use urlencoded JSON strings to express
> map, string and array values.
>
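The urlencoding scheme described above can be sketched with only the JDK. This is a minimal illustration (the JSON string and class name are made up, not taken from the thread): characters that would break a CSV field survive the round trip.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class CsvEncodingDemo {
    public static void main(String[] args) {
        // A JSON value with characters (comma, quotes) that would break a CSV field.
        String json = "{\"tags\":[\"a\",\"b\"],\"name\":\"x,y\"}";

        // URL-encode before writing the field into the csv file...
        String encoded = URLEncoder.encode(json, StandardCharsets.UTF_8);

        // ...and decode on the way back out; the round trip is lossless.
        String decoded = URLDecoder.decode(encoded, StandardCharsets.UTF_8);

        System.out.println(encoded.contains(",")); // false: safe inside a CSV field
        System.out.println(decoded.equals(json));  // true
    }
}
```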
> Second stage:
>   load the csv files into a Spark text table.
> ###############
> CREATE TABLE `a_text`(
>   parameters  string
> )
> load data inpath 'XXX' into table a_text;
> #############
> Third stage:
>  insert into the Spark parquet table, selecting from the text table. To
> take advantage of the complicated data types, we use a UDF to transform
> the JSON string into a map and put the map into the table.
>
> CREATE TABLE `a_parquet`(
>   parameters   map<string,string>
> )
>
> insert into a_parquet select UDF(parameters ) from a_text;
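For illustration only, the per-row work that UDF(parameters) performs can be sketched with just the JDK. The class name is hypothetical, and the toy parser below handles only flat {"k":"v"} objects with no escaped characters; a real UDF should keep using a JSON library as in the code later in the thread.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class UdfSketch {
    // URL-decode the field, then turn a flat {"k":"v",...} object into a Map.
    public static Map<String, String> evaluate(String s) {
        if (s == null) return null;
        String json = URLDecoder.decode(s, StandardCharsets.UTF_8).trim();
        Map<String, String> map = new HashMap<>();
        String body = json.substring(1, json.length() - 1).trim(); // strip { }
        if (body.isEmpty()) return map;
        for (String pair : body.split(",")) {
            String[] kv = pair.split(":", 2);
            map.put(strip(kv[0]), strip(kv[1]));
        }
        return map;
    }

    private static String strip(String s) {
        s = s.trim();
        return s.substring(1, s.length() - 1); // drop the surrounding quotes
    }

    public static void main(String[] args) {
        // %7B%22k%22%3A%22v%22%7D is the urlencoded form of {"k":"v"}
        System.out.println(evaluate("%7B%22k%22%3A%22v%22%7D")); // {k=v}
    }
}
```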
>
> So do you have any suggestions?
>
> ------------------ Original Message ------------------
> *From:* "Ted Yu" <yuzhih...@gmail.com>
> *Sent:* Monday, May 16, 2016, 12:44 AM
> *To:* "喜之郎" <251922...@qq.com>
> *Cc:* "user" <user@spark.apache.org>
> *Subject:* Re: spark udf can not change a json string to a map
>
> Can you let us know more about your use case ?
>
> I wonder if you can structure your udf by not returning Map.
>
> Cheers
>
> On Sun, May 15, 2016 at 9:18 AM, 喜之郎 <251922...@qq.com> wrote:
>
>> Hi, all. I want to implement a UDF which changes a JSON string to a
>> map<string,string>, but a problem occurs. My Spark version is 1.5.1.
>>
>>
>> my udf code:
>> ####################
>> public Map<String, String> evaluate(final String s) {
>>     if (s == null)
>>         return null;
>>     return getString(s);
>> }
>>
>> @SuppressWarnings("unchecked")
>> public static Map<String, String> getString(String s) {
>>     try {
>>         String str = URLDecoder.decode(s, "UTF-8");
>>         ObjectMapper mapper = new ObjectMapper();
>>         Map<String, String> map = mapper.readValue(str, Map.class);
>>         return map;
>>     } catch (Exception e) {
>>         return new HashMap<String, String>();
>>     }
>> }
>> #############
>> exception infos:
>>
>> 16/05/14 21:05:22 ERROR CliDriver:
>> org.apache.spark.sql.AnalysisException: Map type in java is unsupported because JVM type erasure makes spark fail to catch key and value types in Map<>; line 1 pos 352
>>   at org.apache.spark.sql.hive.HiveInspectors$class.javaClassToDataType(HiveInspectors.scala:230)
>>   at org.apache.spark.sql.hive.HiveSimpleUDF.javaClassToDataType(hiveUDFs.scala:107)
>>   at org.apache.spark.sql.hive.HiveSimpleUDF.<init>(hiveUDFs.scala:136)
>> ################
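As the stack trace shows, HiveSimpleUDF passes the method's return type as a plain Class to javaClassToDataType, and at the Class level the String key/value parameters are erased. A small reflection demo (the class name is hypothetical, the method shape matches the UDF above) shows what Spark sees versus what the source declared:

```java
import java.lang.reflect.Method;
import java.util.Map;

public class ErasureDemo {
    // Same shape as the UDF's evaluate method.
    public Map<String, String> evaluate(String s) {
        return null;
    }

    public static void main(String[] args) throws Exception {
        Method m = ErasureDemo.class.getMethod("evaluate", String.class);
        // The erased return type is just the raw Map interface; code that
        // inspects only this Class cannot recover map<string,string>.
        System.out.println(m.getReturnType());        // interface java.util.Map
        // The generic signature survives only in separate metadata.
        System.out.println(m.getGenericReturnType()); // java.util.Map<java.lang.String, java.lang.String>
    }
}
```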
>>
>>
>> I have seen a test suite in Spark which says Spark does not support
>> this kind of UDF.
>> But is there a way to implement it anyway?
>>
>>
>>
>
