Re: [firebird-support] loading data from a .csv file?

2014-05-24 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
Den 2014-05-18 21:23 skrev Alan Shank li...@notoriousshankbrothers.com 
[firebird-support] såhär:

 Is there some way to load data into a table in Firebird, like LOAD DATA
 INFILE... in Mysql? I have comma-delimited files from Mysql that I want
 to load into Firebird tables.

 I have Firebird2.5-super on Ubuntu 14.04, 64-bit.


Thomas mentioned external tables. I've tried a couple of other data 
pump methods, both free and commercial import utilities as well as my 
own C# import loop (Windows and .Net in that case obviously). They all 
use a loop of insert statements, one for each .csv file row. While it 
works fine, it's generally very slow.

I've now switched completely to external tables, or in some cases (not 
so many records) generating insert statements and executing them in one 
query batch. It does require that you modify the file, but it's 
extremely fast!

I usually modify the format to fixed width with or without separator, 
matching a FB table with char columns (with an extra dummy column for 
each separator, if present, and an extra dummy column for the 
linebreak). It's easiest with a character set with a fixed number of 
bytes per character like iso 8859-1, ascii or win-1252, because FB 
expects each column to be the same number of bytes, not characters. I.e. 
with utf8 format it expects a char(10) column to consist of 40 byte in 
the file. The problem with that is that with text editors etc. such a 
string will appear as anything from 10 to 40 characters, depending on 
what characters you enter. If you enter ten 4 byte characters, that's 
the same byte length as 40 1 byte characters. So, such a file will not 
be fixed-width if you count characters. If you convert the format with 
some utility, you have to pad the strings to 40 byte, taking care to 
really count byte, not characters. With fixed width character sets, it's 
much easier.

Then use an external table like this (if separators are present and line 
breaks in your system are 1 single character)

create table External external file 'yourfilepath' (
   DataCol1 char(10) character set iso88591,
   Sep1 char(1) character set iso88591,
   DataCol2 char(25) character set iso88591,
   Sep2 char(1) character set iso88591,
   DataCol3 char(7) character set iso88591,
   Linebreak char(1) character set iso88591
);

Then import into your real table making sure the char columns are 
appropriately converted:

insert into MyTable (
   SomeString, -- varchar(10)
   SomeDate, -- datetime
   SmallInteger -- int
)
select trim(cast(DataCol1 as varchar(10)) SomeString,
   cast(DataCol2 as datetime) SomeDate,
   cast(DataCol3 as int) SmallInteger;

Good luck!
Kjell



Re: [firebird-support] Backups and the Delta File

2014-10-14 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]

russ...@smxi.com [firebird-support] skrev:
We have a large database that runs a full nbackup every night and an 
hourly incremental backup. Is the full backup meant to clear the delta 
file? this doesn't seem to get any smaller.


Also if I restore a gbak backup, what should I do with the delta file?



If you have an active delta file, you have the database in nbackup 
locked state, either because an nbackup operation terminated abnormally, 
or you manually locked the database and neglected to unlock it. The 
delta file is always merged into the live database immediately upon unlock.


On restore, there should be no delta file involved at all. There can be 
multiple nabckups, one for each level you've backed up, but that's a 
different thing.


Regards,
Kjell


[firebird-support] Performance diff between insert...select and for select ... do?

2015-02-13 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
Hi,

I'm writing a utility that will need to do two things for each record in 
an external table and for this purpose I use a for select ... do 
construct in an execute block. I do it this way because external tables 
can't be indexed and I will scan the entire external table anyway.

The two operations are:
1. Update one existing record in the target table.
2. Insert new record in the same target table.

In steady state the target table will contain about 20 million records 
and the external table will contain about 10 thousand records.

But the first time I run this, the target table will be empty and the 
external table will contain about 18 million records. The update will 
never find a record to update during this first execution.

Would I lose a lot of hours if I use the same execute block/for select 
construct the first time? The alternative would be to do a regular 
insert into target table select from externaltable the first time. My 
guess is that the unecessary update will take increasingly long to 
execute as the target table grows towards 18 milliion records, wasting a 
lot of time for me, even if a suitable index is present.

Just to give a measure of the system's performance as it is, a simple 
update on one column for 18 million records in the target table 
currently seems to take about 6 hours. But I could drop a couple of 
indices and perhaps I should bump up the page size from 4k to 8k or 16k.

Regards,
Kjell
-- 

Kjell Rilbe
Telefon: 08-761 06 55
Mobil: 0733-44 24 64

DataDIA AB
Ulvsundavägen 106
168 67 Bromma
www.datadia.se http://www.datadia.se
08-514 905 90

Företagskontakt.se http://xn--fretagskontakt-vpb.se - urval av företag 
och kontaktinformation
Personkontakt.se http://personkontakt.se - urval av hushållsadresser




Re: [firebird-support] Performance diff between insert...select and for select ... do?

2015-02-15 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
Mark Rotteveel m...@lawinegevaar.nl [firebird-support] skrev:

 On 13-2-2015 14:55, Kjell Rilbe kjell.ri...@datadia.se
 [firebird-support] wrote:
  I'm writing a utility that will need to do two things for each record in
  an external table and for this purpose I use a for select ... do
  construct in an execute block. I do it this way because external tables
  can't be indexed and I will scan the entire external table anyway.
 
  The two operations are:
  1. Update one existing record in the target table.
  2. Insert new record in the same target table.

 Maybe you can use MERGE, it uses a source, a target and can
 conditionally insert, update (and delete) record? I assume your target
 table isn't external and is indexed.


Thanks, that might be worth looking into. I suspect it won't work since 
I will need to do BOTH update AND insert for each external record, but I 
haven't checked what merge can do yet... :-)

Kjell
 -- 
 --
 Kjell Rilbe
 DataDIA AB
 E-post: kjell.ri...@datadia.se
 Telefon: 08-761 06 55
 Mobil: 0733-44 24 64


Re: [firebird-support] Extract week of the year (within year)

2015-03-24 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
=?utf-8?B?U3ZlaW4gRXJsaW5nIFR5c3bDpnI=?= 
svein.erling.tysv...@kreftregisteret.no [firebird-support] skrev:

 is possible to get week of the year within year?

 I see that built in functions only support ISO-8601 (or i do not know 
 some parameter)

 Actually, Karol, “week within year” is not the week, Karol. 1 January 
 could be week 53 and 31 December could be week 1. If you want to 
 consider 1 January always as part of week 1 (even if that day could be 
 the only day of week 1), I’d suggest you try something like:

 select (extract(yearday from cast('1.1.2014' as date))-extract(weekday 
 from cast('1.1.2014' as date)))/7+1

 from rdb$database


 If you want the first seven days of a week to always be week 1 (and 
 changing which day a week start with for each year), you could drop 
 the second EXTRACT.


Also, note that the rules for which week should be week number 1 is 
slightly different for different locales. As far as I know, there are 
two different rules, but there could be more. (What about locales where 
Sunday is considered first day of the week, as opposed to Monday?)

So, id you want to calculate the official week number according to a 
specific locale's calendar rules, you'll have to look this up 
thoroughly. I think Excel's week number function gets it right now 
(which hasn't always been the case).

Kjell


Re: [firebird-support] Performance diff between insert...select and for select ... do?

2015-04-13 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
Kjell Rilbe kjell.ri...@datadia.se [firebird-support] skrev:

 Hi,

 I'm writing a utility that will need to do two things for each record in
 an external table and for this purpose I use a for select ... do
 construct in an execute block. I do it this way because external tables
 can't be indexed and I will scan the entire external table anyway.

 The two operations are:
 1. Update one existing record in the target table.
 2. Insert new record in the same target table.

 In steady state the target table will contain about 20 million records
 and the external table will contain about 10 thousand records.

 But the first time I run this, the target table will be empty and the
 external table will contain about 18 million records. The update will
 never find a record to update during this first execution.

 Would I lose a lot of hours if I use the same execute block/for select
 construct the first time? The alternative would be to do a regular
 insert into target table select from externaltable the first time.


As a follow-up to this question, my tests with real data showed that the 
execute block was *very* much slower than a simple insert from the 
external table for the initial import into an empty target table. I 
think it was something like 10 minutes vs. 10 hours, give or take...

I also noted that in steady state, the Firebird solution as a whole was 
very slow. The thing is that for each run of this utility, I would need 
to visit close to 100 % of the records. I ended up tossing Firebird 
altogether and implemented a simple text file format instead, and a 
load-modify-write pattern, making good use of abundant RAM. Went from 
5-10 hours to about 5 minutes.

SQL databases are good for many things, but in this case, they suck (I'm 
assuming Firebird is not significantly worse than any other brand in 
this case).

Regards,
Kjell


Re: Odp: [firebird-support] Performance diff between insert...select and for select ... do?

2015-04-14 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
'=?utf-8?B?bGl2aXVzbGl2aXVzQHBvY3p0YS5vbmV0LnBs?=' 
liviusliv...@poczta.onet.pl [firebird-support] skrev:

 Why not merge?

Yes, indeed. Might have been worth a shot, but I suspect it would have 
been no more than marginally better. Even merge does have to do an index 
lookup for each record to check if one exists or not, doesn't it? Or 
does it ignore the index lookup and simply add a new record version 
regardless?

Furthermore, even the straight insert (with PK and indices 
dropped/inactive) was orders of magnitude slower than the proprietary 
text file I resorted to (no surprise really).

Regards,
Kjell


Re: Odp: [firebird-support] Performance diff between insert...select and for select ... do?

2015-04-14 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
'=?utf-8?B?bGl2aXVzbGl2aXVzQHBvY3p0YS5vbmV0LnBs?=' 
liviusliv...@poczta.onet.pl [firebird-support] skrev:

 Why not merge?

Yes, indeed. Might have been worth a shot, but I suspect it would have 
been no more than marginally better. Even merge does have to do an index 
lookup for each record to check if one exists or not, doesn't it? Or 
does it ignore the index lookup and simply add a new record version 
regardless?

Furthermore, even the straight insert (with PK and indices 
dropped/inactive) was orders of magnitude slower than the proprietary 
text file I resorted to (no surprise really).

Regards,
Kjell
-- 
--
Kjell Rilbe
DataDIA AB
E-post: kjell.ri...@datadia.se
Telefon: 08-761 06 55
Mobil: 0733-44 24 64


[firebird-support] Charset of external table filename?

2015-09-24 Thread Kjell Rilbe kjell.ri...@datadia.se [firebird-support]
Hi,

I'm trying to create an external table to import a list of id numbers 
like this:

create table "ExLöpnr" external file 'D:\Löpnr.txt' (
   "Löpnr" char(7) character set win1252,
   "CRLF" char(2) character set win1252
);

But when I try to select from it, Firebird creates a new empty file 
named "LöPNR.TXT". This is what "LÖPNR.TXT" looks like if it's UTF8 
encoded but displayed as if it were WIN1252/ANSI/LATIN1 encoded.

Checking the system tables, the filename looks correct, but I assume the 
system tables encode that string in UTF8 (or UNICODE-FSS which would be 
identical in this case if I'm not mistaken).

It seems like Firebird does not transliterate this string correctly when 
passing it to the operating system's file operations.

This is in FB version 2.5.2.26540.

Known problem/bug? Or am I missing something? Fixed in later version of 
2.5 or in version 3? Any workaround except renaming the file to plain ascii?

Regards,
Kjell

-- 
--
Kjell Rilbe
DataDIA AB
E-post: kjell.ri...@datadia.se
Telefon: 08-761 06 55
Mobil: 0733-44 24 64