Just a thought from another angle:
Would it be possible for you to receive the files as CSV (or any other text 
format)?
That would give you much more flexibility regarding columns.
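
To illustrate the flexibility, here is a minimal Python sketch (outside Hop), assuming a semicolon-delimited text export with a header row. The function name `read_rows` and the sample data are made up for this example; the point is only that the field names come from the file itself rather than from a fixed field list.

```python
import csv
import io


def read_rows(text_stream, delimiter=";"):
    """Yield one dict per data row, taking the field names from the header line."""
    reader = csv.reader(text_stream, delimiter=delimiter)
    # The first line defines the columns, however many there are.
    header = [name.strip() for name in next(reader, [])]
    for row in reader:
        if not row:
            continue  # skip blank lines
        values = [value.strip() for value in row]
        yield dict(zip(header, values))


# Hypothetical inputs: same layout, different number of columns.
sample_two = "column_1; column_2\na; b\n"
sample_three = "column_1; column_2; column_3\na; b; c\n"

rows_two = list(read_rows(io.StringIO(sample_two)))
rows_three = list(read_rows(io.StringIO(sample_three)))
print(rows_two)
print(rows_three)
```

The same function handles both files without any change, which is the kind of behaviour that is harder to get from a fixed 'Fields' definition.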

Dec 2, 2022 2:13:35 p.m. Diego Mainou <[email protected]>:

> And you can use metadata injection to figure out the lot. 
> 
> There are plenty of Pentaho examples floating around.  
> 
> It would be great if someone could spend the time to replicate it in Hop. 
> 
> Diego
> 
> -------- Original message --------
> From: Hans Van Akelyen <[email protected]>
> Date: 3/12/22 12:08 am (GMT+10:00)
> To: [email protected], [email protected]
> Subject: Re: Any trick to read data from Excel file where no of columns is 
> not known?
> 
> It’s not part of our distribution, but there is a Python transform [1].
> 
> Cheers,
> Hans
> 
> [1] https://github.com/m-a-hall/hop-cpython
> 
> On 2 December 2022 at 12:59:04, [email protected] ([email protected]) wrote:
> 
>> 
>> Thank you Matt,
>>  
>> that looks quite mosaic :-)
>> Maybe I'd try to use Python in the first stages. But there's no Python 
>> action in Hop :-(
>> Will there be one someday?
>> 
>> Regards
>>  
>>  
>> 
>> Sent: Thursday, December 01, 2022 at 4:28 PM
>> From: "Matt Casters" <[email protected]>
>> To: [email protected]
>> Subject: Re: Any trick to read data from Excel file where no of columns is 
>> not known?
>> 
>> A long time ago I did this in 'another' tool.
>>  
>> IIRC this is what's involved:
>> 1) Scan the Excel files and determine the sheets, number of columns, their 
>> names and data types
>> 1a) Sheets: leave the sheet name blank in the list, simply set start 
>> column/row to 0/0, include sheet name as an additional column in the output.
>> 1b) Columns: set a few hundred unnamed columns, all strings, read one row. 
>>  The values are the names of the columns.
>> 1c) Data types: write to a CSV file and use the "File Metadata" transform to 
>> get the types
>> 2) Inject this information into the Excel Input transform using ETL Metadata 
>> injection which also runs the pipeline.
>>  
>> Best of luck,
>> Matt
>>  
>>   
>> 
>> On Thu, Dec 1, 2022 at 3:12 PM <[email protected]> wrote:
>> 
>> Hello,
>> 
>> Is there some way to read data from an Excel file where the number of columns is 
>> unknown?
>> I mean, sometimes the file can look like:
>> 
>> column_1; column_2
>> 
>> but at other times:
>> column_1; column_2; column_3; column_4
>> 
>> Normally we need to define them in the 'Fields' tab. Is it possible to avoid 
>> defining them in a fixed way?
>> 
>> Regards