[sqlite] configure syntax error on HP
I'm trying to compile 3.5.9 on an HP-UX 11i v1 machine but I can't get past step one... I unpacked the amalgamation, cd'd to the directory, and ran './configure' - that didn't work so I tried 'sh ./configure' like the install instructions suggest. I got a syntax error right away.

~/sqlite/sqlite-amalgamation-3.5.9.tar/sqlite-amalgamation-3.5.9
lacpghp1> ./configure
interpreter "/bin/sh" not found
file link resolves to "/usr/bin/sh"
bash: ./configure: /bin/sh^M: bad interpreter: No such file or directory

~/sqlite/sqlite-amalgamation-3.5.9.tar/sqlite-amalgamation-3.5.9
lacpghp1> sh ./configure
./configure[14]: ^M: not found.
./configure: syntax error at line 26 : `in' is not expected.

I haven't changed the configure file at all... the block in question is:

15 # Be more Bourne compatible
16 DUALCASE=1; export DUALCASE # for MKS sh
17 if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
18   emulate sh
19   NULLCMD=:
20   # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which
21   # is contrary to our usage. Disable this feature.
22   alias -g '${1+"$@"}'='"$@"'
23   setopt NO_GLOB_SUBST
24 else
25   case `(set -o) 2>/dev/null` in
26     *posix*) set -o posix ;;
27   esac
28
29 fi

Any ideas on what could be wrong here? Thanks for any input!

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
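[Editor's note: the `^M` tokens in the error output above usually mean the script acquired DOS (CR/LF) line endings somewhere during transfer or extraction, which the HP-UX shell chokes on. A minimal dos2unix-style cleanup can be sketched like this; the file name is just an example:]

```python
import os
import tempfile

def strip_crlf(path):
    """Remove DOS carriage returns from a file in place.

    Returns how many carriage returns were removed."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data.replace(b"\r", b""))
    return data.count(b"\r")

# Demo on a throwaway file standing in for the damaged configure script.
path = os.path.join(tempfile.mkdtemp(), "configure")
with open(path, "wb") as f:
    f.write(b"#!/bin/sh\r\ncase `(set -o) 2>/dev/null` in\r\nesac\r\n")
print(strip_crlf(path))  # prints 3
```

[Once the carriage returns are gone, `/bin/sh^M` in the shebang becomes `/bin/sh` again and `sh ./configure` should at least parse the script.]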
Re: [sqlite] configure syntax error on HP
>On Jun 18, 2008, at 5:38 PM, Andrea Connell wrote:
>
>> I'm trying to compile 3.5.9 on an HP-UX 11i v1 machine but I can't get
>> past step one...
>>
>> I unpacked the amalgamation, cd'd to the directory, and ran
>> './configure' - that didn't work so I tried 'sh ./configure' like the
>> install instructions suggest. I got a syntax error right away.
>>
>> ~/sqlite/sqlite-amalgamation-3.5.9.tar/sqlite-amalgamation-3.5.9
>> lacpghp1> ./configure
>> interpreter "/bin/sh" not found
>
>A system without /bin/sh hardly qualifies as Unix, does it?
>
>Looks like you are going to need to compile it yourself. Ignore the
>configure script. Just type something like this:
>
>    cc -o sqlite3 -DSQLITE_THREADSAFE=0 -DSQLITE_OMIT_LOAD_EXTENSION=1 sqlite3.c shell.c
>
>You might need to add some "-l..." arguments on the end to specify
>libraries, but perhaps not. The -DSQLITE_THREADSAFE=0 eliminates the
>need for pthreads and -DSQLITE_OMIT_LOAD_EXTENSION=1 removes the
>requirement for dlopen, and on most systems those are the only two
>libraries required. But if your system doesn't have a Bourne shell,
>who knows what other peccadillos lurk around the next corner...
>
>D. Richard Hipp
>drh at hwaci.com

Yeah, it's pretty scary. I'm not sure if something is screwy on the box itself or if it's just my account. Either way I'm hoping I can work around it without too much pain and suffering.

I tried your suggestion and it didn't give any errors:

$ cc -o sqlite3 -DSQLITE_THREADSAFE=0 -DSQLITE_OMIT_LOAD_EXTENSION=1 sqlite3.c shell.c
sqlite3.c:
shell.c:
$

I can run the CLI now, but there must be more to do. Ultimately I want to use the C API with a C++ class but when I try compiling...

$ aCC -AA +W829 main.cpp sqlite3.c
main.cpp:
sqlite3.c:
Error 482: "sqlite3.c", line 532 # Array of unknown size; 'const char []' is incomplete.
    SQLITE_API const char sqlite3_version[];
Error 204: "sqlite3.c", line 6464 # Uninitialized const variable "const int sqlite3one".
    SQLITE_PRIVATE const int sqlite3one;
Error 203: "sqlite3.c", line 10309 # Cannot assign 'char *' with 'void *'.
    z = sqlite3_malloc( n );
Error 203: "sqlite3.c", line 11101 # Cannot assign 'long long *' with 'void *'.
    p = malloc(nBytes+8);
Error 203: "sqlite3.c", line 11104 # Cannot assign 'long long *' with 'void *'.
    p = malloc(nBytes+8);
Error 203: "sqlite3.c", line 11130 # Cannot assign 'long long *' with 'void *'.
    p = pPrior;
Error 203: "sqlite3.c", line 11145 # Cannot assign 'long long *' with 'void *'.
    pInt = p;
Error 203: "sqlite3.c", line 11162 # Cannot assign 'long long *' with 'void *'.
    p = pPrior;
Error 203: "sqlite3.c", line 11173 # Cannot assign 'long long *' with 'void *'.
    p = realloc(p, nBytes+8);
Error 203: "sqlite3.c", line 11176 # Cannot assign 'long long *' with 'void *'.
    p = pPrior;
Error 203: "sqlite3.c", line 11178 # Cannot assign 'long long *' with 'void *'.
    p = realloc(p, nBytes+8);
Error 419: "/usr/include/sys/pset.h", line 186 # 'spu_t' is used as a type, but has not been defined as a type.
    extern int pset_assign ( psetid_t pset, spu_t spu, psetid_t* opset)
Error 699: "/usr/include/sys/pset.h", line 186 # Error limit reached; halting compilation.
    extern int pset_assign ( psetid_t pset, spu_t spu, psetid_t* opset)
$

Blech. Can I have a clue to my next step?

Thanks,
Andrea
Re: [sqlite] configure syntax error on HP
Okay, that worked, thanks. Just curious though... I've compiled C and C++ code together many times and never had a problem before, since C is basically a subset of C++. Why doesn't it work here?

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of D. Richard Hipp
Sent: Wednesday, June 18, 2008 6:58 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] configure syntax error on HP

On Jun 18, 2008, at 7:12 PM, Andrea Connell wrote:

> I want to use the C API with a C++ class but when I try compiling...
>
> $ aCC -AA +W829 main.cpp sqlite3.c
> main.cpp:
> sqlite3.c:
> Error 482: "sqlite3.c", line 532 # Array of unknown size; 'const char

SQLite is written in C, not C++. You have to use a C compiler to compile it. If you compile to object code, you can normally link it against C++ code without difficulty. But you cannot compile SQLite directly using a C++ compiler.

D. Richard Hipp
[EMAIL PROTECTED]
[sqlite] Performance on HP
Now that I have SQLite compiled on HP, I am starting to test performance. So far it's pretty disappointing, though.

I am comparing performance of SQLite versus an in-house directory access system. I have the same table structure and data for each of them. The code reads some data from an input file and searches the database for that row, then finds all rows from other tables associated with the first row.

On Windows, with a cold cache SQLite is a bit slower than the in-house system, but after an initial select it runs twice as fast. On HP our in-house system can run through 1000 rows of input in 4 seconds on average. SQLite is taking about 30 seconds with the same input to find the same results.

CREATE TABLE LEVEL1 ( ADDRESS_TYPE text, COUNTRY_ID text, DATA_SOURCE text,
  DIR_SEARCH_AREA1 integer, ... PHONETIC_KEY text, ..., RECORDKEY integer );

CREATE TABLE LEVEL2 ( ... PRIM_NBR_HIGH text, PRIM_NBR_LOW text, ...
  RECORDKEY integer, PARENT_KEY integer );

CREATE TABLE LEVEL3 ( ... RECORDKEY integer, ... PARENT_KEY integer );

CREATE INDEX L1_INDEX ON LEVEL1 (COUNTRY_ID, DIR_SEARCH_AREA1, ADDRESS_TYPE, PHONETIC_KEY);
CREATE INDEX L2_RANGE_INDEX ON LEVEL2 (PARENT_KEY, PRIM_NBR_HIGH, PRIM_NBR_LOW);
CREATE INDEX L3_INDEX ON LEVEL3 (PARENT_KEY);

I also reordered the table data on these index columns to simulate a clustered index.

On Windows I had set the page_size PRAGMA to 4096 before creating the tables, but I think on HP they are 1024, so that wouldn't help?

My program uses the following queries to find the info and binds the appropriate data from the input file.

char * qry = "SELECT * FROM LEVEL1 WHERE COUNTRY_ID = ? AND DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY >= ? AND PHONETIC_KEY < ? ;";
char * qry2 = "SELECT * FROM LEVEL2 WHERE PARENT_KEY = ? AND PRIM_NBR_LOW <= ? AND PRIM_NBR_HIGH >= ?;";
char * qry3 = "SELECT * FROM LEVEL3 WHERE PARENT_KEY = ? ;";
Re: [sqlite] Performance on HP
Sorry, somehow I sent that before I was quite finished. I'm just wondering if there is anything else I should try. About 30,000 rows are found in the end, and it can do this on Windows in less than a second. I'm convinced it shouldn't take 30 seconds on HP. I know the OS's caching method will make a difference, but should it be this dramatic?

Thanks,
Andrea

-Original Message-
From: Andrea Connell
Sent: Friday, June 20, 2008 11:23 AM
To: sqlite-users@sqlite.org
Subject: Performance on HP

Now that I have SQLite compiled on HP, I am starting to test performance. So far it's pretty disappointing, though.

I am comparing performance of SQLite versus an in-house directory access system. I have the same table structure and data for each of them. The code reads some data from an input file and searches the database for that row, then finds all rows from other tables associated with the first row.

On Windows, with a cold cache SQLite is a bit slower than the in-house system, but after an initial select it runs twice as fast. On HP our in-house system can run through 1000 rows of input in 4 seconds on average. SQLite is taking about 30 seconds with the same input to find the same results.

CREATE TABLE LEVEL1 ( ADDRESS_TYPE text, COUNTRY_ID text, DATA_SOURCE text,
  DIR_SEARCH_AREA1 integer, ... PHONETIC_KEY text, ..., RECORDKEY integer );

CREATE TABLE LEVEL2 ( ... PRIM_NBR_HIGH text, PRIM_NBR_LOW text, ...
  RECORDKEY integer, PARENT_KEY integer );

CREATE TABLE LEVEL3 ( ... RECORDKEY integer, ... PARENT_KEY integer );

CREATE INDEX L1_INDEX ON LEVEL1 (COUNTRY_ID, DIR_SEARCH_AREA1, ADDRESS_TYPE, PHONETIC_KEY);
CREATE INDEX L2_RANGE_INDEX ON LEVEL2 (PARENT_KEY, PRIM_NBR_HIGH, PRIM_NBR_LOW);
CREATE INDEX L3_INDEX ON LEVEL3 (PARENT_KEY);

I also reordered the table data on these index columns to simulate a clustered index.

On Windows I had set the page_size PRAGMA to 4096 before creating the tables, but I think on HP they are 1024, so that wouldn't help?

My program uses the following queries to find the info and binds the appropriate data from the input file.

char * qry = "SELECT * FROM LEVEL1 WHERE COUNTRY_ID = ? AND DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY >= ? AND PHONETIC_KEY < ? ;";
char * qry2 = "SELECT * FROM LEVEL2 WHERE PARENT_KEY = ? AND PRIM_NBR_LOW <= ? AND PRIM_NBR_HIGH >= ?;";
char * qry3 = "SELECT * FROM LEVEL3 WHERE PARENT_KEY = ? ;";
Re: [sqlite] Performance on HP
I was originally using LIKE, but since that can't make use of indexing I found this as an alternative to attempt to speed up the query. So if I wanted to search PHONETIC_KEY LIKE 'ABCD%', I replace it with PHONETIC_KEY >= 'ABCD' AND PHONETIC_KEY < 'ABCE'.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Friday, June 20, 2008 12:28 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Performance on HP

Hi Andrea,

I'm interested in your query:

Andrea Connell wrote:
> char * qry = "SELECT * FROM LEVEL1 WHERE COUNTRY_ID = ? AND
> DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY >= ? AND
> PHONETIC_KEY < ? ;";
>
> char * qry2 = "SELECT * FROM LEVEL2 WHERE PARENT_KEY = ? AND
> PRIM_NBR_LOW <= ? AND PRIM_NBR_HIGH >= ?;";
> char* qry3 = "SELECT * FROM LEVEL3 WHERE PARENT_KEY = ? ;";
>

Are the >= and < there on purpose, to do lexicographic string ordering? Or is it some kind of Soundex code? If they cause table scanning, SQLite page loading may increase.

Best Regards,
/Mike/
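[Editor's note: the rewrite described above can be computed generically - keep the prefix as the inclusive lower bound and increment its last character for the exclusive upper bound. A sketch in Python (whose stdlib bundles SQLite; the thread itself uses the C API, and the table here is invented):]

```python
import sqlite3

def prefix_range(prefix):
    """Bounds equivalent to LIKE 'prefix%' for plain ASCII keys."""
    # 'ABCD' -> ('ABCD', 'ABCE'). This breaks if the last character is the
    # maximum in the collation order, so real code needs a fallback for that.
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE keys (phonetic_key TEXT)")
db.execute("CREATE INDEX keys_idx ON keys (phonetic_key)")
db.executemany("INSERT INTO keys VALUES (?)",
               [("ABCD1",), ("ABCDX",), ("ABCE0",), ("ABBZ",)])

low, high = prefix_range("ABCD")
rows = db.execute(
    "SELECT phonetic_key FROM keys WHERE phonetic_key >= ? AND phonetic_key < ? "
    "ORDER BY phonetic_key", (low, high)).fetchall()
print(rows)  # only the two 'ABCD...' keys, the same set LIKE 'ABCD%' returns
```

[Unlike LIKE with a bound parameter, the two range comparisons can always use the index on the key column.]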
Re: [sqlite] Performance on HP
I understand that the machines will perform differently, but that doesn't explain the differences I am seeing.

On Windows, SQLite can do 1,000 rows of input in less than a second and our in-house system can do it in just over a second. They are fairly comparable at this point. When I increase the input to 10,000 rows, SQLite can do it in 5 seconds and our in-house system takes 12. 100,000 rows are done in 54 seconds and 121 seconds, respectively. So from these results you can see that SQLite is about twice as fast as our system on Windows.

However, on HP, SQLite takes 30 seconds to do 1,000 rows and our system can do them in just 4 seconds. Clearly it is not the machine itself that is 30 times slower. If our system took 60 seconds to perform the queries, it would make sense, but that is not at all the case. That is why I am wondering if there is something else in SQLite that might be causing the drastic slowdown on HP. Sorry if that wasn't totally clear from my first message.

Any thoughts?
Andrea

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of John Stanton
Sent: Friday, June 20, 2008 12:08 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Performance on HP

You are measuring the speed of the respective machines. Benchmark each one to get relative performance.

Andrea Connell wrote:
> Sorry, somehow I sent that before I was quite finished. I'm just
> wondering if there is anything else I should try. About 30,000 rows
> are found in the end, and it can do this on Windows in less than a second.
> I'm convinced it shouldn't take 30 seconds on HP. I know the OS's
> caching method will make a difference, but should it be this dramatic?
>
> Thanks,
> Andrea
>
> -----Original Message-----
> From: Andrea Connell
> Sent: Friday, June 20, 2008 11:23 AM
> To: sqlite-users@sqlite.org
> Subject: Performance on HP
>
> Now that I have SQLite compiled on HP, I am starting to test
> performance. So far it's pretty disappointing, though.
>
> I am comparing performance of SQLite versus an in-house directory
> access system. I have the same table structure and data for each of
> them. The code reads some data from an input file and searches the
> database for that row, then finds all rows from other tables
> associated with the first row.
>
> On Windows, with a cold cache SQLite is a bit slower than the in-house
> system, but after an initial select it runs twice as fast.
> On HP our in-house system can run through 1000 rows of input in 4
> seconds on average. SQLite is taking about 30 seconds with the same
> input to find the same results.
>
> CREATE TABLE LEVEL1 ( ADDRESS_TYPE text, COUNTRY_ID text, DATA_SOURCE text,
>   DIR_SEARCH_AREA1 integer, ... PHONETIC_KEY text, ..., RECORDKEY integer );
>
> CREATE TABLE LEVEL2 ( ... PRIM_NBR_HIGH text, PRIM_NBR_LOW text, ...
>   RECORDKEY integer, PARENT_KEY integer );
>
> CREATE TABLE LEVEL3 ( ... RECORDKEY integer, ... PARENT_KEY integer );
>
> CREATE INDEX L1_INDEX ON LEVEL1 (COUNTRY_ID, DIR_SEARCH_AREA1, ADDRESS_TYPE, PHONETIC_KEY);
> CREATE INDEX L2_RANGE_INDEX ON LEVEL2 (PARENT_KEY, PRIM_NBR_HIGH, PRIM_NBR_LOW);
> CREATE INDEX L3_INDEX ON LEVEL3 (PARENT_KEY);
>
> I also reordered the table data on these index columns to simulate a clustered index.
>
> On Windows I had set the page_size PRAGMA to 4096 before creating the
> tables, but I think on HP they are 1024, so that wouldn't help?
>
> My program uses the following queries to find the info and binds the
> appropriate data from the input file.
>
> char * qry = "SELECT * FROM LEVEL1 WHERE COUNTRY_ID = ? AND DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY >= ? AND PHONETIC_KEY < ? ;";
> char * qry2 = "SELECT * FROM LEVEL2 WHERE PARENT_KEY = ? AND PRIM_NBR_LOW <= ? AND PRIM_NBR_HIGH >= ?;";
> char * qry3 = "SELECT * FROM LEVEL3 WHERE PARENT_KEY = ? ;";
Re: [sqlite] Performance on HP
>Have you tried compiling with the profiler and seeing where the time is being spent?

I compiled with the profiler and used prof to analyze the mon.out file. The program took 47 seconds to run, but the results only account for 0.39 seconds.

I do compile sqlite3.c into sqlite3.o and then link it into my project, but I compiled that with profiling too. I'm not sure if that will make a difference. Does this give anybody ideas, or is there something else I should try?

%Time  Seconds  Cumsecs   #Calls  msec/call  Name
  1.1     0.08     0.08                      sqlite3_result_int
  0.9     0.07     0.15  1075436       0.00  _Z19ReadSQLiteComponentiiP12sqlite3_stmt
  0.9     0.07     0.22                      sqlite3_value_text
  0.5     0.04     0.26        1      40.00  main
  0.5     0.04     0.30                      sqlite3_context_db_handle
  0.3     0.02     0.32                      sqlite3_column_text
  0.3     0.02     0.34                      sqlite3_free
  0.1     0.01     0.35   101001       0.00  _ZNSi4readEPclii
  0.1     0.01     0.36     1001       0.01  _Z8ReadLineRSt14basic_ifstreamIcSt11char_traitsIcEE
  0.1     0.01     0.37                      sqlite3_bind_text
  0.1     0.01     0.38                      sqlite3_column_int
  0.1     0.01     0.39                      sqlite3_step
  0.0     0.00     0.39   269863       0.00  _ZNSi13_C_unsafe_getEPlii
  0.0     0.00     0.39   101001       0.00  _ZNSi7_C_ipfxEi
  0.0     0.00     0.39      532       0.00  _ZN8GDRecord8addFieldEPK7GDField
  0.0     0.00     0.39      532       0.00  _ZNKSt7codecvtIcc9mbstate_tE16do_always_noconvEv
  0.0     0.00     0.39      532       0.00  _ZNSt13basic_filebufIcSt11char_traitsIcEE9underflowEv
  0.0     0.00     0.39        8       0.00  _ZNSt13basic_filebufIcSt11char_traitsIcEE4syncEv
  0.0     0.00     0.39        4       0.00  _ZNKSt5ctypeIwE8do_widenEc
  0.0     0.00     0.39        4       0.00  _ZNSt13basic_filebufIcSt11char_traitsIcEE6setbufEPcl
  0.0     0.00     0.39        2       0.00  _ZN4__rw15__rw_facet_base12_C_initfacetERKSt6locale
  0.0     0.00     0.39        1       0.00  _ZN4__rw11__rw_file_t15_C_get_mode_argEi
  0.0     0.00     0.39        1       0.00  _ZN4__rw15__rw_mutex_baseD2Ev
  0.0     0.00     0.39        1       0.00  _ZN4__rw16__rw_facet_makerISt5ctypeIcEE13_C_maker_funcEiPKcm
  0.0     0.00     0.39        1       0.00  _ZN4__rw16__rw_facet_makerISt7codecvtIcc9mbstate_tEE13_C_maker_funcEiPKcm
  0.0     0.00     0.39        1       0.00  _ZNSiD2Ev
  0.0     0.00     0.39        1       0.00  _ZNSt13basic_filebufIcSt11char_traitsIcEE4openEPKcil
  0.0     0.00     0.39        1       0.00  _ZNSt13basic_filebufIcSt11char_traitsIcEE5closeEv
  0.0     0.00     0.39        1       0.00  _ZNSt13basic_filebufIcSt11char_traitsIcEEC1Ev
  0.0     0.00     0.39        1       0.00  _ZNSt9basic_iosIcSt11char_traitsIcEED2Ev
Re: [sqlite] Performance on HP
>> The program took 47 seconds to run, but the results only account for
>> .39 seconds
>
> Most likely all the time is being spent in IO related system calls
> - read(), write() and fsync().
>
> Dan.

Thanks for the idea, Dan. How can I confirm this or try reducing the time spent?

I use the same method for reading my input file when I run both SQLite and our in-house system, and the other way only takes 4 seconds total, so I don't think it could be from that. Also, when I run our in-house system and use the profiler, the time spent adds up to 100%. So this must be something within SQLite. I am using a transaction for my queries. How can I find all of that missing time?

If anybody is interested, here is my main chunk of code. ReadLine() parses the input file and fills the required variables. This method is shared for both database systems (SQLite and ours). ReadSQLiteComponent just calls one of the sqlite3_column functions based on the type of the field, and a similar method is used for our system.

std::ifstream inf(argv[1]);
sqlite3 *db;
sqlite3_stmt *stmt;
sqlite3_stmt *stmt2;
sqlite3_stmt *stmt3;

int rc = sqlite3_open(argv[3], &db);
if( rc ) {
    printf("Can't open database: %s\n", sqlite3_errmsg(db));
    sqlite3_close(db);
    return -1;
}

sqlite3_prepare(db, "BEGIN TRANSACTION;", 100, &stmt, 0);
sqlite3_step(stmt);

char * qry = "SELECT * FROM LEVEL1 WHERE COUNTRY_ID = ? AND DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY LIKE ? ;";
int p = sqlite3_prepare_v2(db, qry, 1000, &stmt, 0);
char * qry2 = "SELECT * FROM LEVEL2 WHERE PARENT_KEY = ? AND PRIM_NBR_LOW <= ? AND PRIM_NBR_HIGH >= ?;";
int p2 = sqlite3_prepare_v2(db, qry2, 1000, &stmt2, 0);
char * qry3 = "SELECT * FROM LEVEL3 WHERE PARENT_KEY = ? ;";
int p3 = sqlite3_prepare_v2(db, qry3, 1000, &stmt3, 0);
if ( p || p2 || p3 ) {
    printf("Can't create prepared statement: %s\n", sqlite3_errmsg(db));
    sqlite3_close(db);
    return -1;
}

while (ReadLine(inf)) {
    sqlite3_bind_text(stmt, 1, cntryid, -1, SQLITE_TRANSIENT);
    sqlite3_bind_int(stmt, 2, searcharea);
    sqlite3_bind_text(stmt, 3, addrtype, -1, SQLITE_TRANSIENT);
    int len = strlen(phnkey);
    phnkey[len] = '%';
    phnkey[len+1] = '\0';
    sqlite3_bind_text(stmt, 4, phnkey, -1, SQLITE_TRANSIENT);
    while (sqlite3_step(stmt) == SQLITE_ROW) {
        for (int i = 0; i
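[Editor's note: the loop's overall shape - prepare each query once, then bind and step per input row, bracketed by an explicit transaction - can be sketched in Python for reference (stdlib sqlite3; the schema is trimmed to the bound columns and the data is invented):]

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE LEVEL1 (COUNTRY_ID TEXT, DIR_SEARCH_AREA1 INTEGER,
                         ADDRESS_TYPE TEXT, PHONETIC_KEY TEXT, RECORDKEY INTEGER);
    CREATE INDEX L1_INDEX ON LEVEL1
        (COUNTRY_ID, DIR_SEARCH_AREA1, ADDRESS_TYPE, PHONETIC_KEY);
    INSERT INTO LEVEL1 VALUES ('US', 1, 'S', 'ABCD1', 100);
    INSERT INTO LEVEL1 VALUES ('US', 1, 'S', 'ABCE9', 101);
""")

qry = ("SELECT RECORDKEY FROM LEVEL1 WHERE COUNTRY_ID = ? AND "
       "DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY LIKE ?")

# The `with` block mirrors the C code's BEGIN/COMMIT bracketing (Python
# only opens an actual transaction on writes, but the loop shape is the
# point); the module's statement cache plays the role of prepare-once.
inputs = [("US", 1, "S", "ABCD"), ("US", 1, "S", "ABCE")]
found = []
with db:
    for cntryid, searcharea, addrtype, phnkey in inputs:
        # append '%' to the key, as the C code does before binding
        for (key,) in db.execute(qry, (cntryid, searcharea, addrtype, phnkey + "%")):
            found.append(key)
print(found)  # [100, 101]
```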
Re: [sqlite] sqlite DB creation performance
Creation time would be a bit faster if you don't create the index until AFTER importing the data. That way it doesn't have to worry about keeping things in their proper order on every insert.

-Andrea

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Brooks, Phil
Sent: Thursday, June 26, 2008 1:58 PM
To: sqlite-users@sqlite.org
Subject: [sqlite] sqlite DB creation performance

Hi,

I am new to the list and to using sqlite - I am looking at using sqlite for a 2-way lookup table for my application. I have a home-grown file-based solution, but was hoping to find better performance and more functionality by putting the lookup tables into sqlite.

I am looking for the fastest way possible to create a large 2-way lookup dictionary for an application. While I am creating the database, I don't need it to be in a consistent or usable state; I just need it to be that way when I get done. If something bad happens along the way, that is OK - in that case, the database can be recreated from scratch. I have been playing around with options (page size, turning off journaling, different indexes) that have improved the database creation phase quite a bit, and am wondering if the group can provide additional insight.

More Detail
---
The tables simply provide fast lookup to return a string on the left side when presented with its right-side pair and vice versa. Both strings (left and right) belong to a group of entries I'll call 'group'. The groups are all independent of one another. The database contains 2 of these cross-reference lookup tables and a bunch of smaller stuff that helps make sense of things, but doesn't really matter a lot in terms of size and performance. The two dictionary tables are pretty big -- the testcase I am using has one table with 43 million entries and another table with 7 million entries.

To figure out the performance of this system under sqlite, I am just putting the two dictionaries into a sqlite database that looks like this:

First Try
-
/* Create the tables: */
CREATE TABLE foo(left string, right string, group int, flags int);
CREATE TABLE bar(left string, right string, group int);

/* Create the indices: */
CREATE INDEX left_foo_dict on foo ( left );
CREATE INDEX right_foo_dict on foo ( right );
CREATE INDEX left_bar_dict on bar( left );
CREATE INDEX right_bar_dict on bar( right );

/* Import the data: */
.separator ,
.import foo.csv foo
.import bar.csv bar

The strings in the data range in size from about 2 characters up to about 60, but they are, theoretically, unlimited in size. My ascii input csv files are about 5GB for foo.csv and about 500MB for bar.csv. My old database format requires about 7GB to store the database and takes about 30-40 minutes to create.

So I ran the above script:

$ time sqlite3 db.sql < script1
real  2h26m47.68s
user  49m59.09s
sys   4m47.75s

The resulting file size is about 14GB. OK, that is a starting point: 5-6x the creation time and twice the space of my original format. On the bright side, opening the sqlite database is vastly faster than my old format (instant vs. ~16 min) and lookups are about the same, maybe slightly faster in some cases.

Second Try
-
I looked at the docs and found a few interesting pragmas - since the csv data is naturally ordered on the group field, I decided to see if creating a dual key would help in any way:

/* First I tried some pragmas that looked promising */
PRAGMA journal_mode = OFF;
PRAGMA legacy_file_format = OFF;
PRAGMA page_size = 32768;

/* Create the tables: */
CREATE TABLE foo(left string, right string, group int, flags int);
CREATE TABLE bar(left string, right string, group int);

/* Next, I tried adding an ascending key on group */
CREATE INDEX left_foo_dict on foo ( group asc, left );
CREATE INDEX right_foo_dict on foo ( group asc, right );
CREATE INDEX left_bar_dict on bar( group asc, left );
CREATE INDEX right_bar_dict on bar( group asc, right );

/* Import the data: */
.separator ,
.import foo.csv foo
.import bar.csv bar

Slight improvement in size - down to about 13GB now. Run time improved a lot - down to about 1 hour, though I don't really know which of the things I tried was responsible for the improvement. I'll try them one at a time at some point to see which actually helped.

Now the Question
----
Can anyone suggest additional things I might try to:

1) reduce db creation time
- or -
2) reduce the size of the file?

I could actually live with a larger-than-7GB file size if I could beat the 30-40 minute runtime of my original solution by a reasonable margin.

Also, I am assuming that importing a .csv file using .import will be similar in speed to creating the database using the C++ interface with the sqlite3_prepare/sqlite3_bind/sqlite3_step/sqlite3_reset interfaces -- is that a valid assumption?

Phil
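[Editor's note: Andrea's suggestion - build the indexes after the bulk load - combined with a single enclosing transaction is the usual recipe. A scaled-down sketch (Python stdlib sqlite3; schema approximated, with the columns renamed since `group` is a reserved word in SQL):]

```python
import sqlite3
import time

def load(rows, index_first):
    """Bulk-load rows, creating the index either before or after the import."""
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA journal_mode = OFF")  # crash-safety not needed during creation
    db.execute("CREATE TABLE foo (lhs TEXT, rhs TEXT, grp INTEGER)")
    if index_first:
        db.execute("CREATE INDEX lhs_foo_dict ON foo (lhs)")
    t0 = time.time()
    with db:  # one transaction around the whole import
        db.executemany("INSERT INTO foo VALUES (?, ?, ?)", rows)
    if not index_first:
        db.execute("CREATE INDEX lhs_foo_dict ON foo (lhs)")
    elapsed = time.time() - t0
    n = db.execute("SELECT count(*) FROM foo").fetchone()[0]
    db.close()
    return n, elapsed

rows = [("L%07d" % i, "R%07d" % i, i % 100) for i in range(100000)]
n1, t1 = load(rows, index_first=True)
n2, t2 = load(rows, index_first=False)
print(n1, n2)  # both 100000; building the index afterwards is typically faster
```

[The gap grows with table size, since maintaining a B-tree per insert costs more than one bulk sort at the end.]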
Re: [sqlite] Performance on HP
I'm still struggling with this issue. I've tried compiling on Solaris and get similar results. On Windows SQLite is twice as fast as our in-house database system, but on HP and now Solaris it is over 6 times slower than ours. There must be something I am missing here.

I've tried increasing the cache size and the page size but saw no real difference. It was still taking over 30 seconds to find 30,000 rows while our system does it in 4 seconds. The databases have the same schema and the queries are all using the proper index (I checked the explain output).

CREATE TABLE LEVEL1 ( ADDRESS_TYPE text, COUNTRY_ID text, DATA_SOURCE text,
  DIR_SEARCH_AREA1 integer, ... PHONETIC_KEY text, ..., RECORDKEY integer );

CREATE TABLE LEVEL2 ( ... PRIM_NBR_HIGH text, PRIM_NBR_LOW text, ...
  RECORDKEY integer, PARENT_KEY integer );

CREATE TABLE LEVEL3 ( ... RECORDKEY integer, ... PARENT_KEY integer );

CREATE INDEX L1_INDEX ON LEVEL1 (COUNTRY_ID, DIR_SEARCH_AREA1, ADDRESS_TYPE, PHONETIC_KEY);
CREATE INDEX L2_RANGE_INDEX ON LEVEL2 (PARENT_KEY, PRIM_NBR_HIGH, PRIM_NBR_LOW);
CREATE INDEX L3_INDEX ON LEVEL3 (PARENT_KEY);

SELECT * FROM LEVEL1 WHERE COUNTRY_ID = ? AND DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY LIKE ? ;
SELECT * FROM LEVEL2 WHERE PARENT_KEY = ? AND PRIM_NBR_LOW <= ? AND PRIM_NBR_HIGH >= ?;
SELECT * FROM LEVEL3 WHERE PARENT_KEY = ? ;

Not sure if it will help at all, but once the database is created it will literally never be updated or have another row inserted. Is there any special optimization if we ONLY care about fast selects, and nothing else?

Any ideas? It's driving me crazy why SQLite is this much slower on UNIX boxes, while other applications maintain their speed.

Thanks, from a desperate coder

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrea Connell
Sent: Monday, June 23, 2008 11:48 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Performance on HP

>> The program took 47 seconds to run, but the results only account for
>> .39 seconds
>
> Most likely all the time is being spent in IO related system calls
> - read(), write() and fsync().
>
> Dan.

Thanks for the idea, Dan. How can I confirm this or try reducing the time spent?

I use the same method for reading my input file when I run both SQLite and our in-house system, and the other way only takes 4 seconds total, so I don't think it could be from that. Also, when I run our in-house system and use the profiler, the time spent adds up to 100%. So this must be something within SQLite. I am using a transaction for my queries. How can I find all of that missing time?

If anybody is interested, here is my main chunk of code. ReadLine() parses the input file and fills the required variables. This method is shared for both database systems (SQLite and ours). ReadSQLiteComponent just calls one of the sqlite3_column functions based on the type of the field, and a similar method is used for our system.

std::ifstream inf(argv[1]);
sqlite3 *db;
sqlite3_stmt *stmt;
sqlite3_stmt *stmt2;
sqlite3_stmt *stmt3;

int rc = sqlite3_open(argv[3], &db);
if( rc ) {
    printf("Can't open database: %s\n", sqlite3_errmsg(db));
    sqlite3_close(db);
    return -1;
}

sqlite3_prepare(db, "BEGIN TRANSACTION;", 100, &stmt, 0);
sqlite3_step(stmt);

char * qry = "SELECT * FROM LEVEL1 WHERE COUNTRY_ID = ? AND DIR_SEARCH_AREA1 = ? AND ADDRESS_TYPE = ? AND PHONETIC_KEY LIKE ? ;";
int p = sqlite3_prepare_v2(db, qry, 1000, &stmt, 0);
char * qry2 = "SELECT * FROM LEVEL2 WHERE PARENT_KEY = ? AND PRIM_NBR_LOW <= ? AND PRIM_NBR_HIGH >= ?;";
int p2 = sqlite3_prepare_v2(db, qry2, 1000, &stmt2, 0);
char * qry3 = "SELECT * FROM LEVEL3 WHERE PARENT_KEY = ? ;";
int p3 = sqlite3_prepare_v2(db, qry3, 1000, &stmt3, 0);
if ( p || p2 || p3 ) {
    printf("Can't create prepared statement: %s\n", sqlite3_errmsg(db));
    sqlite3_close(db);
    return -1;
}

while (ReadLine(inf)) {
    sqlite3_bind_text(stmt, 1, cntryid, -1, SQLITE_TRANSIENT);
    sqlite3_bind_int(stmt, 2, searcharea);
    sqlite3_bind_text(stmt, 3, addrtype, -1, SQLITE_TRANSIENT);
    int len = strlen(phnkey);
    phnkey[len] = '%';
    phnkey[len+1] = '\0';
    sqlite3_bind_text(stmt, 4, phnkey, -1, SQLITE_TRANSIENT);
    while (sqlite3_step(stmt) == SQLITE_ROW) {
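[Editor's note: two things worth checking for a select-only workload, sketched here in Python (stdlib sqlite3; data and pragma values are illustrative): open the file read-only so SQLite can skip write-side work, and confirm via EXPLAIN QUERY PLAN that the index really is being used before blaming I/O.]

```python
import os
import sqlite3
import tempfile

# Build a tiny stand-in database on disk.
path = os.path.join(tempfile.mkdtemp(), "dir.db")
db = sqlite3.connect(path)
db.executescript("""
    CREATE TABLE LEVEL3 (RECORDKEY INTEGER, PARENT_KEY INTEGER);
    CREATE INDEX L3_INDEX ON LEVEL3 (PARENT_KEY);
    INSERT INTO LEVEL3 VALUES (1, 7), (2, 7), (3, 8);
""")
db.commit()
db.close()

# Open read-only: the database never changes, so no write access is needed.
ro = sqlite3.connect("file:%s?mode=ro" % path, uri=True)
ro.execute("PRAGMA cache_size = 10000")  # keep more pages in memory (value illustrative)

# Check the plan: on an index-using plan the detail string names L3_INDEX.
plan = ro.execute("EXPLAIN QUERY PLAN SELECT * FROM LEVEL3 WHERE PARENT_KEY = ?",
                  (7,)).fetchall()
print(plan[0][-1])

rows = ro.execute("SELECT RECORDKEY FROM LEVEL3 WHERE PARENT_KEY = 7 "
                  "ORDER BY RECORDKEY").fetchall()
print(rows)  # [(1,), (2,)]
```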
Re: [sqlite] Performance on HP
>On Jun 30, 2008, at 2:37 PM, Andrea Connell wrote:
>
>> Any ideas? It's driving me crazy why SQLite is this much slower on
>> UNIX boxes, while other applications maintain their speed.
>
>What filesystem are you using on the unix boxes? Are you *sure* you
>are not using NFS?
>
>D. Richard Hipp
>[EMAIL PROTECTED]

You know what, I'm not sure, but I'd be willing to bet I am using NFS. I'll check around and see if there's anything else I can move it to. Thanks!
[sqlite] View with Dynamic Fields ?
I'm not sure if what I want is possible to do in SQL, but I -am- sure that one of you will know. Given two tables - one with questions to ask applicants and one with an applicant's answer to a particular question - I want to make a flattened view with all of an applicant's answers in one row. This is easy enough to do when I know ahead of time which questions are in the first table, but I don't like the hard-coded approach. My schema is as follows: CREATE TABLE tblquestions (questionid int, question varchar(100)); INSERT INTO "tblquestions" VALUES(1,'whats up'); INSERT INTO "tblquestions" VALUES(2,'how are you'); INSERT INTO "tblquestions" VALUES(3,'whats your name'); CREATE TABLE tblanswers (questionid int, applicantid int, answer varchar(2500)); INSERT INTO "tblanswers" VALUES(1,100,'stuff for answer one'); INSERT INTO "tblanswers" VALUES(2,100,'stuff for answer two'); INSERT INTO "tblanswers" VALUES(3,100,'stuff for answer three'); INSERT INTO "tblanswers" VALUES(1,200,'random text one'); INSERT INTO "tblanswers" VALUES(2,200,'random text two'); INSERT INTO "tblanswers" VALUES(3,200,'random text three'); Here is the view I have come up with so far, which would require editing whenever an insert or delete is done on tblQuestions. CREATE VIEW allanswers as SELECT applicantid, (select answer from tblanswers Z where questionid = 1 and Z.applicantid = A.applicantid) As Answer1, (select answer from tblanswers Z where questionid = 2 and Z.applicantid = A.applicantid) As Answer2, (select answer from tblanswers Z where questionid = 3 and Z.applicantid = A.applicantid) As Answer3 FROM tblanswers A group by applicantid; sqlite> select * from allanswers; 100|stuff for answer one|stuff for answer two|stuff for answer three 200|random text one|random text two|random text three Has anybody come across a problem like this and found a reasonable dynamic solution? 
Even something like a trigger on tblQuestions to change the view would be great; I'm just not sure about the SQL involved. If it helps, I don't really care what the columns end up being named. Also, I can't guarantee that the questionids will be consecutive or in any order.

Thanks,
Andrea
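[One way to get the "update itself on an insert or delete" behavior is to regenerate the view from application code whenever tblquestions changes: read the current questionids and build the CREATE VIEW text from them. A minimal sketch in Python, using the schema from the post above; the helper name `rebuild_allanswers` and the use of DROP VIEW IF EXISTS are illustrative, not from the thread:]

```python
import sqlite3

def rebuild_allanswers(conn):
    # Generate one correlated-subquery column per question currently
    # in tblquestions, then drop and recreate the view. Call this
    # after any insert/delete on tblquestions.
    qids = [r[0] for r in conn.execute(
        "SELECT questionid FROM tblquestions ORDER BY questionid")]
    cols = ", ".join(
        f"(SELECT answer FROM tblanswers Z WHERE Z.questionid = {qid} "
        f"AND Z.applicantid = A.applicantid) AS Answer{qid}"
        for qid in qids)
    conn.execute("DROP VIEW IF EXISTS allanswers")
    conn.execute(
        f"CREATE VIEW allanswers AS SELECT applicantid, {cols} "
        "FROM tblanswers A GROUP BY applicantid")

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblquestions (questionid int, question varchar(100));
INSERT INTO tblquestions VALUES(1,'whats up');
INSERT INTO tblquestions VALUES(2,'how are you');
INSERT INTO tblquestions VALUES(3,'whats your name');
CREATE TABLE tblanswers (questionid int, applicantid int, answer varchar(2500));
INSERT INTO tblanswers VALUES(1,100,'stuff for answer one');
INSERT INTO tblanswers VALUES(2,100,'stuff for answer two');
INSERT INTO tblanswers VALUES(3,100,'stuff for answer three');
INSERT INTO tblanswers VALUES(1,200,'random text one');
INSERT INTO tblanswers VALUES(2,200,'random text two');
INSERT INTO tblanswers VALUES(3,200,'random text three');
""")
rebuild_allanswers(conn)
rows = conn.execute(
    "SELECT * FROM allanswers ORDER BY applicantid").fetchall()
print(rows)
```

[This sidesteps the trigger limitation entirely, at the cost of every writer to tblquestions having to remember to call the rebuild.]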
Re: [sqlite] View with Dynamic Fields ?
Harold Wood wrote:
>my approach would be a table for applicants; then the table for answers would have the applicantid and the questionid as well as the answer.
>
>CREATE TABLE tblApplicants (applicantid int, applicantname varchar(100));
>
>Now just do a select joining the tables:
>
>select ap.applicantname, qu.question, an.answer
>from tblanswers an
>inner join tblApplicants ap on ap.applicantid = an.applicantid
>inner join tblquestions qu on qu.questionid = an.questionid
>-- optional where clause to select just 1 applicant
>where ap.applicantid = 1
>-- optional order by clause to make it neat
>order by applicantname asc, qu.questionid asc

I am planning on having an applicant table (I guess I just left that to be assumed in the first post - sorry), but that isn't going to help my need at all. Your query gives me these results:

andrea|whats up|stuff for answer one
andrea|how are you|stuff for answer two
andrea|whats your name|stuff for answer three
todd|whats up|random text one
todd|how are you|random text two
todd|whats your name|random text three

When what I want is this:

100|stuff for answer one|stuff for answer two|stuff for answer three
200|random text one|random text two|random text three

I want one row with all answers for an applicant, and I want it to work for an arbitrary number of rows in tblQuestions, or be able to update itself on an insert or delete. Any ideas?
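[As an aside: for a fixed question set, the one-row-per-applicant shape asked for here can also be produced with conditional aggregation (one CASE per question) instead of correlated subqueries. It is just as hard-coded, so it does not solve the dynamic part of the problem, but it needs only a single pass over tblanswers. A sketch, run through Python's sqlite3 just to show the output:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblanswers (questionid int, applicantid int, answer varchar(2500));
INSERT INTO tblanswers VALUES(1,100,'stuff for answer one');
INSERT INTO tblanswers VALUES(2,100,'stuff for answer two');
INSERT INTO tblanswers VALUES(3,100,'stuff for answer three');
INSERT INTO tblanswers VALUES(1,200,'random text one');
INSERT INTO tblanswers VALUES(2,200,'random text two');
INSERT INTO tblanswers VALUES(3,200,'random text three');
""")

# Classic pivot via conditional aggregation: each CASE is NULL except
# for the matching question, and max() keeps the one non-NULL answer
# per group.
rows = conn.execute("""
    SELECT applicantid,
           max(CASE WHEN questionid = 1 THEN answer END) AS Answer1,
           max(CASE WHEN questionid = 2 THEN answer END) AS Answer2,
           max(CASE WHEN questionid = 3 THEN answer END) AS Answer3
    FROM tblanswers
    GROUP BY applicantid
""").fetchall()
print(rows)
```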
Re: [sqlite] View with Dynamic Fields ?
Thanks Chris & Dennis for the group_concat tip... It is an interesting idea, but I don't think it makes sense to use this time. If I have to parse the results in my code, I might as well just get the answers in separate rows and group them together in code, without having to worry about returning values for the non-answered questions.

I knew this wouldn't be an easy thing to solve, but I appreciate the input. I am still holding a shred of hope for a trigger that can recreate the view whenever the questions table is modified, but I haven't put much thought into it yet and I'm sure it's just as challenging. I'm going to keep working on this and see if I can come up with anything. If anybody thinks of something, let me know. I know that I could do this in code, and at this point it would probably take less time, but I'd really like to see if this is possible more than anything. Thanks

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Tuesday, July 08, 2008 1:21 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] View with Dynamic Fields ?

This gives the same result set you got and does not need to be edited:

select applicantid, group_concat(answer, '|')
from (select applicantid, answer from tblanswers order by questionid)
group by applicantid;

The group_concat() function is part of recent versions of SQLite. It returns a single string rather than columns, but it can be parsed. The subquery ordering by questionid is needed to ensure that the columns (answers) from all respondents are in the same order.

However, this will only work if tblanswers always has one record for every question. That is, if your applicants skip one or more questions, you will still need to insert a record for skipped questionids, perhaps with a default answer like 'NOT ANSWERED'.
Similarly, if you add new questions to tblquestions later, you will need to insert 'NO ANSWER' values into tblanswers for existing applicantids, otherwise the columns (answers) will not align.

Chris

On Tue, 8 Jul 2008, Andrea Connell wrote:

> I'm not sure if what I want is possible to do in SQL, but I -am- sure
> that one of you will know.
> Given two tables - one with questions to ask applicants and one with
> an applicant's answer to a particular question - I want to make a
> flattened view with all of an applicant's answers in one row. This is
> easy enough to do when I know ahead of time which questions are in the
> first table, but I don't like the hard-coded approach.
>
> My schema is as follows:
>
> CREATE TABLE tblquestions (questionid int, question varchar(100));
> INSERT INTO "tblquestions" VALUES(1,'whats up');
> INSERT INTO "tblquestions" VALUES(2,'how are you');
> INSERT INTO "tblquestions" VALUES(3,'whats your name');
> CREATE TABLE tblanswers (questionid int, applicantid int, answer varchar(2500));
> INSERT INTO "tblanswers" VALUES(1,100,'stuff for answer one');
> INSERT INTO "tblanswers" VALUES(2,100,'stuff for answer two');
> INSERT INTO "tblanswers" VALUES(3,100,'stuff for answer three');
> INSERT INTO "tblanswers" VALUES(1,200,'random text one');
> INSERT INTO "tblanswers" VALUES(2,200,'random text two');
> INSERT INTO "tblanswers" VALUES(3,200,'random text three');
>
> Here is the view I have come up with so far, which would require
> editing whenever an insert or delete is done on tblQuestions.
>
> CREATE VIEW allanswers as
> SELECT applicantid,
> (select answer from tblanswers Z where questionid = 1 and
> Z.applicantid = A.applicantid) As Answer1,
> (select answer from tblanswers Z where questionid = 2 and
> Z.applicantid = A.applicantid) As Answer2,
> (select answer from tblanswers Z where questionid = 3 and
> Z.applicantid = A.applicantid) As Answer3
> FROM tblanswers A group by applicantid;
>
> sqlite> select * from allanswers;
> 100|stuff for answer one|stuff for answer two|stuff for answer three
> 200|random text one|random text two|random text three
>
> Has anybody come across a problem like this and found a reasonable
> dynamic solution? Even something like a trigger on tblQuestions to
> change the View would be great, I'm just not sure about the SQL
> involved. If it helps, I don't really care what the columns end up
> being named. Also I can't guarantee that the questionids will be
> consecutive or in any order.
>
> Thanks,
> Andrea
Re: [sqlite] View with Dynamic Fields ?
Really? I never knew about that restriction before. Well, okay then. Everybody is telling me it is impossible, and that's kind of what I was thinking coming into this, so I guess that's that. Thanks!

Andrea

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dennis Cote
Sent: Tuesday, July 08, 2008 5:08 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] View with Dynamic Fields ?

Andrea Connell wrote:
>
> I am still holding a shred of hope for a trigger that can recreate the
> view whenever the questions table is modified
>

I wouldn't hold much hope for that. Triggers can only execute insert, update, delete, or select SQL statements. There is no way to execute a create table or create view command.

Dennis Cote
Re: [sqlite] View with Dynamic Fields ?
I'm not thinking big picture here. I just want this one view. I am not planning on using it in other queries; I just don't want to have to either group lots of data together or parse data apart in all of my projects that will be using these tables. I suppose that is what I will be doing, though. It's not a big deal - I was just wondering if a simpler solution was available.

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nicolas Williams
Sent: Wednesday, July 09, 2008 2:41 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] View with Dynamic Fields ?

On Wed, Jul 09, 2008 at 12:32:27PM -0700, Andrea Connell wrote:
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Dennis Cote
> >
> > Andrea Connell wrote:
> > > I am still holding a shred of hope for a trigger that can recreate
> > > the view whenever the questions table is modified
> >
> > I wouldn't hold much hope for that. Triggers can only execute
> > insert, update, delete, or select SQL statements. There is no way to
> > execute a create table or create view command.
>
> Really? I never knew about that restriction before. Well, okay then.
> Everybody is telling me it is impossible, and that's kind of what I
> was thinking coming in to this so I guess that's that. Thanks!

It's not just that it's impossible. Imagine you develop a dynamic SQL extension to SQLite (or any other open source SQL engine). Then what? How do you use the results of SQL queries with dynamic column counts in larger SQL queries? That's the hard part here.

If you want "dynamic" column counts only for display purposes, then you avoid that hard problem. But then, you might as well use group_concat() to get what you want, rather than work on extending the engine to support some notion of dynamic SQL.
Nico
--