For context, the SalesGenerator was a schema and data profile from a large
company that stored their table in ORC. I had to scrub the details of the
table schema, but it generated random data that looked similar to their
production data.

.. Owen

On Wed, May 4, 2022 at 9:48 PM Owen O'Malley <owen.omal...@gmail.com> wrote:

> If you just want to test all of the types, I think
> examples/TestOrcFile.test1.orc will do it. It doesn't have much data in it,
> but it should test the functionality. You can also use the java/bench code
> to generate data. Look at the
> java/bench/core/src/java/org/apache/orc/bench/core/SalesGenerator.java
> class for how to generate arbitrary data.
>
> .. Owen
>
> On Wed, May 4, 2022 at 9:00 PM Larry White <ljw1...@gmail.com> wrote:
>
>> No worries. I appreciate the effort and understand the lack of time :)
>>
>> On Wed, May 4, 2022 at 4:29 PM Ian Kaplan <i...@bearcave.com> wrote:
>>
>>>
>>>   The library was in Maven.  I'm not sure why it's not there now.
>>>
>>>   I need to update the code to use the latest Java ORC base code and
>>> track down this issue. Unfortunately I don't have time to do this right
>>> now.  I apologize for any difficulties with the package.
>>>
>>>   Ian
>>>
>>>
>>> On Wed, May 4, 2022 at 4:12 PM Larry White <ljw1...@gmail.com> wrote:
>>>
>>>> Thanks, Ian.  This looks helpful and your API seems very nice.
>>>> Unfortunately, it doesn't seem like javaorc is on Maven Central currently.
>>>> I tried a Google search and Maven Central's own search.
>>>>
>>>>
>>>>
>>>> On Wed, May 4, 2022 at 11:35 AM Ian Kaplan <i...@bearcave.com> wrote:
>>>>
>>>>>
>>>>>   I have written some tests for my JavaOrc library. I have not had a
>>>>> chance to update the library for the recent releases of the base Java Orc
>>>>> library.  My tests can be found here:
>>>>> https://github.com/IanLKaplan/javaorc/tree/master/src/test/java/com/topstonesoftware/javaorc
>>>>>
>>>>> On Wed, May 4, 2022 at 11:12 AM Larry White <ljw1...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I'm testing Arrow's ORC adaptor to see if there are data types it
>>>>>> doesn't handle. Are there any ORC test files that cover all data types
>>>>>> (including complex types like union and lists)?  The initial goal is to
>>>>>> document any unsupported types, with a secondary goal of improving
>>>>>> support if any unsupported types are found.
>>>>>>
>>>>>> If someone knows of ORC tests that cover reading and all types, that
>>>>>> would be extra awesome.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> larry
>>>>>>
>>>>>
