You could try using the TypeSerializerInputFormat.

On Thu, Jul 30, 2015 at 2:08 PM, Flavio Pompermaier <pomperma...@okkam.it>
wrote:

> How can I create a Flink dataset from a directory path that contains a
> set of Java objects serialized with Kryo (one file per object)?
>
> On Thu, Jul 30, 2015 at 1:41 PM, Till Rohrmann <trohrm...@apache.org>
> wrote:
>
>> Hi Flavio,
>>
>> To use the Kryo serializer for a given type, you can call the
>> registerTypeWithKryoSerializer method of the ExecutionEnvironment object.
>> You pass it the type you want to be serialized with Kryo and an
>> implementation of the com.esotericsoftware.kryo.Serializer class. If the
>> given type is not supported by Flink’s own serialization framework, then
>> this custom serializer is used. You register the types at the beginning
>> of your Flink program:
>>
>> def main(args: Array[String]): Unit = {
>>   val env = ExecutionEnvironment.getExecutionEnvironment
>>
>>   env.registerTypeWithKryoSerializer(classOf[MyType], classOf[MyTypeSerializer])
>>
>>   ...
>>
>>   env.execute()
>>
>> }
>>
>> Cheers,
>> Till
>>
>> On Thu, Jul 30, 2015 at 12:45 PM, Flavio Pompermaier <
>> pomperma...@okkam.it> wrote:
>>
>>> I have a project that produces RDF quads, and I have to store them so
>>> that I can read them with Flink afterwards.
>>> I could use Thrift/Protobuf/Avro, but this would mean adding a lot of
>>> transitive dependencies to my project.
>>> Maybe I could use Kryo to store those objects. Is there any example of
>>> creating a dataset of objects serialized with Kryo?
>>>
>>> On Thu, Jul 30, 2015 at 11:10 AM, Stephan Ewen <se...@apache.org> wrote:
>>>
>>>> Quick response: I am not opposed to that, but there are tuple libraries
>>>> around already.
>>>>
>>>> Do you specifically need the Flink tuples, for interoperability between
>>>> Flink and other projects?
>>>>
>>>> On Thu, Jul 30, 2015 at 11:07 AM, Stephan Ewen <se...@apache.org>
>>>> wrote:
>>>>
>>>>> Should we move this to the dev list?
>>>>>
>>>>> On Thu, Jul 30, 2015 at 10:43 AM, Flavio Pompermaier <
>>>>> pomperma...@okkam.it> wrote:
>>>>>
>>>>>> Any thoughts about this (moving the tuple classes into a separate
>>>>>> self-contained project with no transitive dependencies, so that they
>>>>>> can be easily used in other external projects)?
>>>>>>
>>>>>> On Mon, Jul 6, 2015 at 11:09 AM, Flavio Pompermaier <
>>>>>> pomperma...@okkam.it> wrote:
>>>>>>
>>>>>>> Do you think it would be a good idea to extract the Flink tuples into
>>>>>>> a separate project, to allow simpler dependency management in
>>>>>>> Flink-compatible projects?
>>>>>>>
>>>>>>> On Mon, Jul 6, 2015 at 11:06 AM, Fabian Hueske <fhue...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> at the moment, Tuples are more efficient than POJOs, because POJO
>>>>>>>> fields are accessed via Java reflection whereas Tuple fields are
>>>>>>>> accessed directly.
>>>>>>>> This performance penalty could be overcome by code-generated
>>>>>>>> serializers and comparators, but I am not aware of any work in that
>>>>>>>> direction.
>>>>>>>>
>>>>>>>> Best, Fabian
>>>>>>>>
>>>>>>>> 2015-07-06 11:01 GMT+02:00 Flavio Pompermaier <pomperma...@okkam.it>:
>>>>>>>>
>>>>>>>>> Hi to all,
>>>>>>>>> I was thinking of writing my own Flink-compatible library, and I
>>>>>>>>> basically need a Tuple5.
>>>>>>>>>
>>>>>>>>> Is there any performance loss in using a POJO with 5 String fields
>>>>>>>>> vs a Tuple5?
>>>>>>>>> If yes, wouldn't it be a good idea to extract the Flink tuples into a
>>>>>>>>> separate simple project (e.g. flink-java-tuples) that has no other
>>>>>>>>> dependencies, so that other libs can write their Flink-compatible
>>>>>>>>> logic without having to exclude all the transitive dependencies of
>>>>>>>>> flink-java?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Flavio
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>
>
