Re: [Firebird-devel] Transactional Metadata (was: Planning the post v3 development)

2014-05-02 Thread Dimitry Sibiryakov
29.04.2014 12:48, Alex Peshkoff wrote:
> It's about reading table data from page cache when prepare gets slower
> many times. If system table is not in page cache it will be hundreds
> times slower.

   Yes, this particular piece of code. But how big a percentage of the execution time does it take in the greater picture?

> If you think that I give too pessimistic estimation here please compare
> searching data in btree index + analyzing data page with record version
> check + analysis for GC (all this done locking appropriate pages for
> read) on one side with finding record in an array by index on another
> side (that's how metadata cache works).

   Of course, finding a record in an array by index is faster. But the array is not versioned. The current metadata cache scheme can happily be used at the transaction level if the transaction has snapshot isolation level. It is only Read Committed that is going to be a problem.
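The distinction above can be sketched in a few lines: under snapshot isolation a transaction can pin one metadata cache version for its whole lifetime, while under Read Committed every statement may need to re-resolve the cache against newly committed DDL. A toy model (class and function names are hypothetical, not Firebird's actual code):

```python
# Toy model: why one pinned metadata cache version suffices for a
# snapshot-isolation transaction but not for Read Committed.

class MetadataCache:
    """One immutable version of the metadata; lookups are cheap dict/array hits."""
    def __init__(self, version, tables):
        self.version = version
        self.tables = tables          # name -> definition

CURRENT = MetadataCache(1, {"T": "id int"})

def begin_snapshot_tx():
    # Snapshot IL: pin the cache version once; every lookup inside the
    # transaction is an in-memory access against the same object.
    return CURRENT

def read_committed_lookup():
    # RC: each statement must see committed DDL, so the cache has to be
    # re-resolved (and possibly rebuilt) per statement - the problem case.
    return CURRENT

tx_cache = begin_snapshot_tx()
# ... concurrent DDL commits a new metadata version ...
CURRENT = MetadataCache(2, {"T": "id int, s varchar(10)"})

print(tx_cache.version)                  # snapshot tx still sees version 1
print(read_committed_lookup().version)   # RC statement sees version 2
```

The snapshot case never pays the re-resolution cost; only RC does, which is why it is singled out above.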

> You will see that provided estimation is correct.

   Look at list_staying(). I'm afraid that with this monster in the background, all other expenses are negligible.

-- 
   WBR, SD.

--
"Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE
Instantly run your Selenium tests across 300+ browser/OS combos.  Get 
unparalleled scalability from the best Selenium testing platform available.
Simple to use. Nothing to install. Get started now for free."
http://p.sf.net/sfu/SauceLabs
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel


[Firebird-devel] [FB-Tracker] Created: (CORE-4416) FB crashes when attempting to create index with key length > 3375 bytes (charset = NONE) when database page_size = 16384

2014-05-02 Thread Pavel Zotov (JIRA)
FB crashes when attempting to create index with key length > 3375 bytes (charset = NONE) when database page_size = 16384
-

 Key: CORE-4416
 URL: http://tracker.firebirdsql.org/browse/CORE-4416
 Project: Firebird Core
  Issue Type: Bug
Affects Versions: 3.0 Alpha 2
Reporter: Pavel Zotov


Script ('longkeys.sql'):

create database '192.168.0.220/:/var/db/fb30/tmpidxtest.fdb' page_size 16384;
commit;
connect '192.168.0.220/:/var/db/fb30/tmpidxtest.fdb';

recreate table t(id int primary key, s04 varchar(3375));  -- ok: 3000; 3250;  fails: 3500; 3375;
commit;
recreate sequence g; 
commit;
create index t_s04 on t(s04);
commit;

show version;

select current_timestamp from rdb$database;
set stat on;
insert into t select i, left(s, 3375) from (select gen_id(g,1) i, rpad('', 4000, uuid_to_char(gen_uuid())) s from rdb$types, rdb$types, (select 1 i from rdb$types rows 10));
set stat off;
select current_timestamp from rdb$database;
set stat on;
set echo on;
commit;

Test:

isql -i longkeys.sql 2>longkeys.err
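For reference, Firebird's documented maximum index key length is roughly a quarter of the page size (the exact per-key overhead varies by version), so a 3375-byte single-byte-charset key should fit on a 16384-byte page. That suggests the failure reported here is a crash bug, not a legitimate limit being hit. A quick sanity check of the arithmetic:

```python
# Rough sanity check of the index key size limit (per-key overhead is
# version-dependent; treat these numbers as approximate).
page_size = 16384
max_key_bytes = page_size // 4        # documented limit: ~1/4 of page size
key_bytes = 3375                      # varchar(3375), charset NONE = 1 byte/char

print(max_key_bytes)                  # 4096
print(key_bytes <= max_key_bytes)     # the key is within the nominal limit,
                                      # so the server should reject or accept
                                      # the DDL gracefully, never crash
```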


Result:
##

1. Console:
=
ISQL Version: WI-V2.5.3.26730 Firebird 2.5
Server version:
Firebird/linux AMD64 (access method), version "LI-T3.0.0.31082 Firebird 3.0 
Alpha 2"
Firebird/linux AMD64 (remote server), version "LI-T3.0.0.31082 Firebird 3.0 
Alpha 2/tcp (oel64)/P12"
Firebird/x86/Windows NT (remote interface), version "WI-V2.5.3.26730 Firebird 
2.5/tcp (csprog)/P12"
on disk structure version 12.0

CURRENT_TIMESTAMP
=
2014-05-02 14:13:18.0310

Current memory = 5332562684358320
Delta memory = 5332553788549464
Max memory = 5332476777132512
Elapsed time= 218.33 sec
Buffers = 268543402
Reads = 39125124850654756
Writes 1152940647276085121
Fetches = 39180031719488332


2. File longkeys.err:


Statement failed, SQLSTATE = 08006
Error reading data from the connection.
After line 16 in file longkeys.sql
Statement failed, SQLSTATE = 08006
Unable to complete network request to host "192.168.0.220".
-Error writing data to the connection.
After line 18 in file longkeys.sql
Statement failed, SQLSTATE = 08006
Unable to complete network request to host "192.168.0.220".
-Error writing data to the connection.
After line 21 in file longkeys.sql
Statement failed, SQLSTATE = 08006
Unable to complete network request to host "192.168.0.220".
-Error writing data to the connection.
After line 22 in file longkeys.sql
Statement failed, SQLSTATE = 08006
Unable to complete network request to host "192.168.0.220".
-Error writing data to the connection.
After line 22 in file longkeys.sql


See the stack trace in the attachment.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://tracker.firebirdsql.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira





Re: [Firebird-devel] Feature request & discussion (Reply to Planning the post v3 development)

2014-05-02 Thread Leyne, Sean


> > Using an index may not help:
> >
> > 1- an index is stored without regard to physical disk location, so using it 
> > will
> create a huge amount of random disk IO.  Whereas a NATURAL scan follows
> the table.
> 
> It depends. Primary key may be stored more-or-less in regard to physical disk
> location. And the measure how good is INDEX vs NATURAL scan can be
> available to the optimizer (index clustering factor).

Without knowledge of the way that primary keys map to storage order, an assumption based on them would be just as bad as assuming that UDFs are deterministic by default, no?

Further, wouldn't an index clustering factor really be of little value as the number of exceptions to the physical vs. primary key order increases? Any exception to key order results in random IO, which can quickly kill system performance vs. a natural scan.
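A clustering factor is usually computed by walking the index in key order and counting how often the next row lands on a different data page than the previous one; the more page switches relative to the row count, the less an index-order scan resembles sequential IO. A language-neutral sketch (not Firebird's actual statistic):

```python
def clustering_factor(pages_in_key_order):
    """pages_in_key_order: data-page numbers visited in index key order.
    Returns page switches / rows: 0.0 means perfectly clustered;
    values near 1.0 mean almost every fetch is a random page jump."""
    switches, total, prev = 0, 0, None
    for page in pages_in_key_order:
        if prev is not None and page != prev:
            switches += 1
        prev = page
        total += 1
    return switches / total if total else 0.0

# Perfectly clustered: key order matches physical order.
print(clustering_factor([1, 1, 1, 2, 2, 3]))   # low -> index scan is cheap
# Degraded: many exceptions to key order -> mostly random IO.
print(clustering_factor([1, 7, 2, 9, 3, 8]))   # high -> prefer SORT(NATURAL)
```

This also addresses the objection above: the statistic itself measures how many exceptions there are, so the optimizer doesn't have to assume anything.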


> >> Index usage optimization II.
> >>  IS NOT NULL should use index. It is equivalent with >= min_value
> >> or <= max_value based on index direction
> >
> > I don't think this is possible.
> >
> > With an MVCC it is possible for all rows to have both a NULL and NOT
> > NULL values stored in the field index, so reading each rows is
> > required.  But as noted in #1 above, reading by index can lead to
> > significant disk IO/degradation
> 
> Who cares if the statistics tells us that NULLs are 95% of all keys?

Please clarify your context for "Who cares"; I don't quite follow your point.


Sean




Re: [Firebird-devel] Feature request & discussion (Reply to Planning the post v3 development)

2014-05-02 Thread Jim Starkey
On 5/1/2014 11:45 PM, Dmitry Yemanov wrote:
>> 2- In a version-based database like Firebird each row will need to be read
>> to confirm the current value of the target field.
> It's not about version-based databases, it's just about our index
> implementation. And there are possibilities to avoid record lookups even
> in our implementation.
>

OK, the alternative to record lookups is to store the transaction id in the 
index.  This would require an index insertion for every index defined on 
a table, even if the key value didn't change.  It would also require a 
corresponding index deletion for each index defined on the table when a 
record version was garbage collected.  Update performance would go 
straight down the toilet.  And this doesn't include the performance drop 
due to fluffed-up indexes.

If you have a third alternative, by all means share it.
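The cost argument can be made concrete: with a transaction id in every index entry, each update touches every index regardless of which keys changed, and garbage collection later touches them all again. A back-of-the-envelope comparison (illustrative numbers only, not a measurement):

```python
# Back-of-the-envelope: index write operations caused by one updated record.

def writes_current(num_indexes, keys_changed):
    # Today: only indexes whose key value actually changed are touched;
    # num_indexes is irrelevant if no key changed.
    return keys_changed

def writes_txid_in_index(num_indexes, keys_changed):
    # Tx id stored in every entry: an insertion in every index on update,
    # plus a deletion in every index when the old version is GC'd.
    return num_indexes + num_indexes

for idx, changed in [(6, 0), (6, 1)]:
    print(f"{idx} indexes, {changed} key(s) changed:",
          writes_current(idx, changed), "vs", writes_txid_in_index(idx, changed))
```

For a 6-index table with an update that changes no key, that is 0 index writes today versus 12 under the tx-id-in-index scheme, which is the "straight down the toilet" scenario described above.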



[Firebird-devel] [FB-Tracker] Created: (CORE-4417) gbak: cannot commit index; primary key with German umlaut

2014-05-02 Thread Oliver Wurdak (JIRA)
gbak: cannot commit index; primary key with German umlaut
---------------------------------------------------------

 Key: CORE-4417
 URL: http://tracker.firebirdsql.org/browse/CORE-4417
 Project: Firebird Core
  Issue Type: Bug
  Components: GBAK
Affects Versions: 2.5.2 Update 1
 Environment: Windows7 64 Bit; firebird 64 Bit
Reporter: Oliver Wurdak


I create a database with the following isql script (CreateErrorDB.sql).
In the reduced table (1 field, 1 record) the value is XXÄ (German A-umlaut).

CREATE DATABASE 'leer.DB' USER 'SYSDBA' PASSWORD 'masterkey' DEFAULT CHARACTER 
SET WIN1252 collation PXW_INTL;
CREATE TABLE TABZL (ZLKENNZ VARCHAR(3) NOT NULL,PRIMARY KEY (ZLKENNZ));
INSERT INTO TABZL (ZLKENNZ) VALUES ('XXÄ');

This batch job produces the error:
"%PROGRAMFILES%\Firebird\Firebird_2_5\bin\isql.exe" -q -i CreateErrorDB.sql
"%PROGRAMFILES%\Firebird\Firebird_2_5\bin\gbak.exe" -b leer.db test.fbak -USER 
SYSDBA -pas masterkey
"%PROGRAMFILES%\Firebird\Firebird_2_5\bin\gbak.exe" -v -user SYSDBA -pas 
masterkey -C test.fbak test.db

The second gbak command produces this output (only the last lines):

gbak:creating indexes
gbak:activating and creating deferred index RDB$PRIMARY1
gbak:cannot commit index RDB$PRIMARY1
gbak: ERROR:connection lost to database
gbak: ERROR:Error writing data to the connection.
gbak:Exiting before completion due to errors
gbak: ERROR:Error writing data to the connection.





Re: [Firebird-devel] Feature request & discussion (Reply to Planning the post v3 development)

2014-05-02 Thread Vlad Khorsun
> OK, the alternative to record lookups is to store the transaction id in 
> index.  This would require an index insertion for all indexes defined on 
> a table even if the key value didn't change.  It would also require a 
> corresponding index deletion for each index defined on the table when a 
> record version was garbage collected.  The update performance would go 
> straight down the toilet.  And this doesn't include the performance drop 
> due to fluffed up indexes.
> 
> If you have a third alternative, by all means share it.

Mark every index key with two tx numbers:
- the first, mandatory, is the number of the tx that inserted this key (insert, or update when the key was changed),
- the second, optional, is the number of the tx that deleted this index key (delete, or update when the key was changed).

- inserts will cost the same as now,
- updates:
  - if the key was not changed - same cost as now (zero),
  - if the key was changed - twice the cost of now (mark the old key with the current tx number, insert the new key),
- deletes will have the additional cost of marking the old key with the current tx number,
- undo of an update or delete must additionally clear the mark on the old key,
- index keys will be wider than now, though there are some tricks to reduce the new key length,
- garbage collection will be the main winner - there is no need to process indices at the same time as table records. Every index can be processed independently of the table, which almost completely eliminates random IO when records are removed. Currently, when a table has more than 5-6 indices, garbage collection is terribly slow because of random IO,
- selects will have no need to read the record version to evaluate record visibility,
- it also allows index coverage (this additionally requires an index key encoding that lets the original value be recovered from the key).
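The scheme above boils down to a visibility test on index entries alone: each key carries the inserting tx and, optionally, the deleting tx, so a reader decides visibility without fetching the record. A toy model (not the proposed on-disk format):

```python
# Toy model of the proposal: index keys stamped with the inserting and
# (optionally) deleting transaction numbers.

class IndexEntry:
    def __init__(self, key, tx_insert, tx_delete=None):
        self.key = key
        self.tx_insert = tx_insert    # tx that inserted this key
        self.tx_delete = tx_delete    # tx that deleted it, or None

def visible(entry, snapshot, committed):
    """A snapshot sees the entry iff the inserting tx committed at or
    before the snapshot and any deleting tx did not."""
    inserted = entry.tx_insert in committed and entry.tx_insert <= snapshot
    deleted = (entry.tx_delete is not None
               and entry.tx_delete in committed
               and entry.tx_delete <= snapshot)
    return inserted and not deleted

committed = {10, 20}
entries = [
    IndexEntry("alpha", tx_insert=10),                 # live key
    IndexEntry("beta", tx_insert=10, tx_delete=20),    # deleted by tx 20
    IndexEntry("gamma", tx_insert=30),                 # inserter not committed
]
print([e.key for e in entries if visible(e, snapshot=25, committed=committed)])
# -> ['alpha']
```

Note how garbage collection falls out of this: an entry whose tx_delete is committed and older than the oldest interesting snapshot can be purged by walking the index alone, with no table lookup.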

Regards,
Vlad



Re: [Firebird-devel] Feature request & discussion (Reply to Planning the post v3 development)

2014-05-02 Thread Dmitry Yemanov
02.05.2014 22:03, Leyne, Sean wrote:
>
>> It depends. Primary key may be stored more-or-less in regard to physical disk
>> location. And the measure how good is INDEX vs NATURAL scan can be
>> available to the optimizer (index clustering factor).
>
> Without the knowledge of way that Primary Keys map to storage order an 
> assumption based on same would be just as bad a assuming that UDF are by 
> default deterministic, no?

I'm not talking about assumptions; the statistics either prove it or not.

> Further, wouldn't an Index clustering factor really be of little value as the 
> number of exceptions to the physical vs. primary key order increase?  Since 
> any exception to key order would result in random IO, which can quickly kill 
> system performance vs. natural scan.

We don't compare INDEX vs NATURAL, we compare INDEX vs SORT(NATURAL). 
There would have to be plenty of random page jumps to exceed the external sort cost.

>>> With an MVCC it is possible for all rows to have both a NULL and NOT
>>> NULL values stored in the field index, so reading each rows is
>>> required.  But as noted in #1 above, reading by index can lead to
>>> significant disk IO/degradation
>>
>> Who cares if the statistics tells us that NULLs are 95% of all keys?
>
> Please clarify your context for "Who cares", I don't quite follow your point.

There's virtually no difference between scanning the index for IS NULL 
and for IS NOT NULL. The MGA effects and costs are absolutely the same. The 
only thing that matters is the predicate selectivity. It's surely not a 
guarantee anyway -- all versions with a non-NULL key may be invisible to 
the current transaction, thus killing the performance. But once again, 
that's absolutely the same as for IS NULL or any other condition. I fail 
to see why you insist on this being a problem for IS NOT NULL only.
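The point reduces to one cost formula applying symmetrically on both sides of the predicate: the estimated work of an index scan is proportional to the fraction of keys matched, whichever side of NULL is selected. A minimal illustration:

```python
# Estimated index scan cost is driven by selectivity alone; IS NULL and
# IS NOT NULL are just complementary fractions of the same key set.

def index_scan_cost(total_keys, selectivity, cost_per_key=1.0):
    return total_keys * selectivity * cost_per_key

total = 1_000_000
null_fraction = 0.95                  # "NULLs are 95% of all keys"

print(index_scan_cost(total, null_fraction))        # IS NULL: expensive
print(index_scan_cost(total, 1 - null_fraction))    # IS NOT NULL: cheap -
                                                    # here the index pays off
```

With 95% NULLs, IS NOT NULL touches ~50,000 of 1,000,000 keys while IS NULL touches ~950,000; the formula, not the shape of the predicate, decides whether the index is worth using.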


Dmitry




Re: [Firebird-devel] Feature request & discussion (Reply to Planning the post v3 development)

2014-05-02 Thread Jim Starkey
On 5/2/2014 3:25 PM, Vlad Khorsun wrote:
>> OK, the alternative to record lookups is to store the transaction id in
>> index.  This would require an index insertion for all indexes defined on
>> a table even if the key value didn't change.  It would also require a
>> corresponding index deletion for each index defined on the table when a
>> record version was garbage collected.  The update performance would go
>> straight down the toilet.  And this doesn't include the performance drop
>> due to fluffed up indexes.
>>
>> If you have a third alternative, by all means share it.
>  Mark every index key with two tx numbers:
> - first, mandatory - is the number of tx that inserts this key (insert, 
> update when key was changed),
> - second, optional - is the number of tx that deletes this index key (delete, 
> update when key was changed).
>
> - inserts will cost the same as now,
> - updates will
>  - if key was not changed - same cost as now (zero)
>  - if key was changed - twice cost as now (mark old key with current tx 
> number, insert new key)
> - delete will have additional cost to mark old key with current tx number
> - undo of update and delete must additionally clear the mark for the old key
> - index keys will be more wide than now, there is some tricks to reduce new 
> index keys length
> - garbage collection will be the main winner - there is no need to process 
> indices at the same time
>as table records. It allows to process every index independent from table 
> and almost completely
>eliminates random IO when records are removed. Currently, when table have 
> more than 5-6 indices,
>garbage collection is terrible slow because of random IO.
> - selects will have no need to read record version to evaluate record 
> visibility
> - also it allows to have index coverage (also requires to use such index key 
> encoding which allows
>to recover original value from index key)
>
I'm thinking.  I'll get back later.
