Re: [GENERAL] Load a csv to remote postgresql database
On 3/1/2017 10:01 PM, priyanka raghav wrote:
> I am trying to load a csv file of approx 500mb to remote postgres
> database. Earlier when the app server and db server were co-located,
> COPY command was working fine but ever since the db server is moved to
> a different box, the command is failing. I understand that COPY
> command searches the file on database server which it is not able to
> find and hence is throwing an error. I cannot use \copy as it is a
> psql command. What options do I have to go about this. I need an
> urgent answer to this. Thanks in advance.

I recommend trying pgloader: http://pgloader.io/

But if you have to do it inside a program, without shelling out to another program, then you'll have to figure out how to use the streaming interface in your chosen database API, along with:

    COPY tablename FROM STDIN [WITH options...];

(Note that this does not actually mean it reads from the server's STDIN.) This is what psql uses for \copy: you read the local CSV file yourself and send it through the streaming interface. For instance, if you're using libpq, you'd use: https://www.postgresql.org/docs/current/static/libpq-copy.html#LIBPQ-COPY-SEND

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
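To make the three variants concrete, here is a minimal sketch; the table name `mytable` and the file paths are placeholders, not from the original question:

```sql
-- Server-side COPY: the path is resolved on the *database server's*
-- filesystem, which is why it broke once the servers were split.
COPY mytable FROM '/path/on/db/server/data.csv' WITH (FORMAT csv, HEADER true);

-- Client-side \copy in psql: the file is read on the client machine
-- and streamed over the connection.
\copy mytable FROM 'data.csv' WITH (FORMAT csv, HEADER true)

-- What \copy (and a driver's streaming API) issues under the hood;
-- the client then sends the CSV bytes itself over the COPY protocol:
COPY mytable FROM STDIN WITH (FORMAT csv, HEADER true);
```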
Re: [GENERAL] Load a csv to remote postgresql database
Hi

2017-03-02 7:01 GMT+01:00 priyanka raghav:
> Hi,
>
> I am trying to load a csv file of approx 500mb to remote postgres
> database. Earlier when the app server and db server were co-located,
> COPY command was working fine but ever since the db server is moved to
> a different box, the command is failing. I understand that COPY
> command searches the file on database server which it is not able to
> find and hence is throwing an error. I cannot use \copy as it is a
> psql command. What options do I have to go about this. I need an
> urgent answer to this. Thanks in advance.

psql's \copy is a client-side special case of the server-side COPY. If you cannot use psql's \copy, then you have to use the COPY API of your programming language's driver, which creates the bridge between client and server.

Why can you not use psql \copy?

Regards

Pavel
[GENERAL] Load a csv to remote postgresql database
Hi, I am trying to load a csv file of approx 500mb to a remote postgres database. Earlier, when the app server and db server were co-located, the COPY command was working fine, but ever since the db server was moved to a different box, the command has been failing. I understand that COPY searches for the file on the database server, which it is not able to find, and hence throws an error. I cannot use \copy as it is a psql command. What options do I have? I need an urgent answer to this. Thanks in advance.
Re: [GENERAL] Understanding pg_last_xlog_receive_location
On Thu, Mar 2, 2017 at 5:53 AM, Zach Walton wrote:
> I was able to test 9.4.11 and am seeing the same behavior:
>
> postgres=# SELECT pg_is_in_recovery(), pg_last_xlog_receive_location(),
> pg_last_xlog_replay_location();
>  pg_is_in_recovery | pg_last_xlog_receive_location | pg_last_xlog_replay_location
> -------------------+-------------------------------+------------------------------
>  t                 |                               | 0/3000198

Okay, you said that you are using streaming replication here, but the standby you are performing this query on seems to be just a hot standby recovering WAL from a WAL archive, not via streaming. I would bet that there is no WAL receiver running. pg_last_xlog_receive_location() gets the last WAL position received from a streaming node, which is NULL if no streaming is happening, while pg_last_xlog_replay_location() is set by the startup process when replaying WAL records. Again, I see no bug here; you should check whether a WAL receiver is running on this standby server.
--
Michael
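The distinction can be checked directly with the functions already used in this thread (annotations below are a sketch; note that the pg_stat_wal_receiver view only exists on 9.6 and later, so on 9.4 one has to look for a "wal receiver process" in the OS process list instead):

```sql
SELECT pg_is_in_recovery(),              -- t on any standby, archive or streaming
       pg_last_xlog_receive_location(),  -- NULL unless a WAL receiver is streaming
       pg_last_xlog_replay_location();   -- set by the startup process during replay
```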
Re: [GENERAL] Understanding pg_last_xlog_receive_location
I was able to test 9.4.11 and am seeing the same behavior:

postgres=# SELECT pg_is_in_recovery(), pg_last_xlog_receive_location(), pg_last_xlog_replay_location();
 pg_is_in_recovery | pg_last_xlog_receive_location | pg_last_xlog_replay_location
-------------------+-------------------------------+------------------------------
 t                 |                               | 0/3000198

On Wed, Mar 1, 2017 at 11:17 AM, Zach Walton wrote:
> Thanks. We have some patches on the 9.4.5 code base (not in the
> replication path). I'll work on porting those to 9.4.11 and will report
> back to the thread.
Re: [GENERAL] Re: Causeless CPU load waves in backend, on windows, 9.5.5 (EDB binary).
27.02.2017 10:08, I wrote:
> [...] So, what I've observed is that Wait* functions _usually_ go to sleep nicely when the state is not signalled, but _sometimes_, depending on unknown criteria, they can instead do a busy-loop wait or something CPU-expensive. Maybe it tries to optimize the delay, or maybe it is a bug. The effect varies somewhat depending on the Windows version, CPU cores, selected system timer frequency, and Wait* call pattern (frequency). I cannot currently see how it can be fixed in a generic and reliable way in postgres without sacrificing some performance. On the other hand, leaving it as-is exposes the server to substantial CPU abuse (even unintentional, as it initially happened in my case).

Ok, I've finally found what happened. False alarm, in short. The server is perfectly fine.

My mistake was that I was using the most basic ways of monitoring CPU load, i.e. those provided by GetProcessTimes() and taskmgr.exe and such. Because they only operate at the scheduler granularity (usually approx. 16 ms), some rounding has to happen in the CPU consumption calculation, and this rounding can apparently distort the measurement dramatically in some corner cases.

xperf, by contrast, employs a fairer approach to CPU consumption measurement, based on the exact times at which processes are switched in and out, which gives a much better result. In my case the results of taskmgr and xperf are very different, and considering the above, taskmgr's report should be dismissed as bogus. xperf indicates that my stress-tests do not actually cause the postgres backend process to consume any noticeable CPU percentage at all. That's good.

Regards,
Nikolai
Re: [GENERAL] Understanding pg_last_xlog_receive_location
Thanks. We have some patches on the 9.4.5 code base (not in the replication path). I'll work on porting those to 9.4.11 and will report back to the thread.
Re: [GENERAL] disk writes within a transaction
On Feb 17, 2017, at 4:05 PM, Jeff Janes wrote:
> It will probably be easier to refactor the code than to quantify just how
> much damage it does.

Thanks for all the info. It looks like this is something worth prioritizing because of the effects on indexes. We had discussed a fix and pointed it; rewriting the code that causes this is pretty massive, and will require blocking out a resource FT for 2 weeks on rewrites and testing. We don't really have time to spare any of those devs, so time to make product tradeoffs ;(
Re: [GENERAL] appropriate column for storing ipv4 address
> On Mar 1, 2017, at 8:39 AM, jonathan vanasco wrote:
>
> I have to store/search some IP data in Postgres 9.6 and am second-guessing my
> storage options.
>
> The types of searching I'm doing: [...]
>
> 2. on tracked_ip_block, i search/join against the tracked_ip_address to
> show known ips in a block, or a known block for an ip.
>
> i used cidr instead of inet for the ip_address because it saved me a cast on
> joins and appears to work the same. was that the right move? is there a
> better option?

If you're looking to do fast searches for "is this IP address in any of these CIDR blocks" you might want to look at https://github.com/RhodiumToad/ip4r as a possible alternative.

Cheers,
Steve
Re: [GENERAL] appropriate column for storing ipv4 address
On 03/01/2017 08:39 AM, jonathan vanasco wrote:
> I have to store/search some IP data in Postgres 9.6 and am second-guessing
> my storage options. Would anyone mind giving this a quick look for me?
> Right now I have two tables, and am just using cidr for both:

Hi Jonathan,

CIDR seems like a better match to how people think about IPs, but another option would be to use a custom range type on inet. I wrote a blog post about that here, including how to use a GiST index to get fast searches: http://illuminatedcomputing.com/posts/2016/06/inet-range/

Maybe it will be useful for you! Or maybe there is already some built-in way to treat cidr columns like ranges?

Paul
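Roughly what the linked post describes, sketched here (the type name `inetrange` and the sample addresses are illustrative, not fixed API): a range type can be declared over inet because inet has a btree ordering, and containment is then a range operator.

```sql
-- A custom range type over inet:
CREATE TYPE inetrange AS RANGE (subtype = inet);

-- Containment test with the range operator <@ :
SELECT '192.168.1.5'::inet <@ inetrange('192.168.1.0', '192.168.1.255', '[]');
```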
[GENERAL] appropriate column for storing ipv4 address
I have to store/search some IP data in Postgres 9.6 and am second-guessing my storage options. Would anyone mind giving this a quick look for me? Right now I have two tables, and am just using cidr for both:

    create table tracked_ip_address (
        id SERIAL primary key,
        ip_address CIDR not null
    );

    create table tracked_ip_block (
        id SERIAL primary key,
        block_cidr CIDR not null,
        ownership_data TEXT
    );

The types of searching I'm doing:

1. On tracked_ip_address, I'll search for neighboring ips, e.g.:

    select * from tracked_ip_address where ip_address << '192.168'::CIDR;
    select * from tracked_ip_address where ip_address << '192.168.1'::CIDR;

2. On tracked_ip_block, I search/join against tracked_ip_address to show known ips in a block, or a known block for an ip.

I used cidr instead of inet for the ip_address because it saved me a cast on joins and appears to work the same. Was that the right move? Is there a better option?

Thanks in advance.

/ jonathan
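For search 2, a sketch of the join against the tables above, plus a GiST index using the built-in inet_ops operator class (available since PostgreSQL 9.4) so the containment operators can use an index; the index name is a placeholder:

```sql
-- GiST index supporting <<, <<=, >> and >>= containment searches:
CREATE INDEX tracked_ip_address_ip_gist
    ON tracked_ip_address USING gist (ip_address inet_ops);

-- Known ips in a block, or the block(s) covering an ip:
SELECT a.ip_address, b.block_cidr
FROM tracked_ip_address a
JOIN tracked_ip_block b ON a.ip_address << b.block_cidr;
```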
Re: [GENERAL] is (not) distinct from
Adrian Klaver writes:
> Where I am going with this, is that it is not clear to me how you are
> matching the two sets of records to determine whether they are different
> or not.

He's not. The query is forming the cartesian product of the two tables and then dropping join rows where the tables match ... but every B row is going to have multiple A rows where it doesn't match, and those join rows will all survive the WHERE. Then "select distinct" gets rid of the duplicates, and since nothing from A is presented in the result, it's not very obvious what's happening.

This is a great example of "select distinct" being used as a band-aid over a fundamental misunderstanding of SQL. It's good advice to never use "distinct" unless you know exactly why your query is generating duplicate rows in the first place.

regards, tom lane
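A sketch of the set comparison presumably intended here, without the cartesian product (assuming both tables have the same column list; EXCEPT compares rows NULL-safely, with the same semantics as IS NOT DISTINCT FROM, and returns distinct rows without needing SELECT DISTINCT):

```sql
-- Rows of b that have no identical counterpart in a:
SELECT * FROM b
EXCEPT
SELECT * FROM a;
```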
Re: [GENERAL] is (not) distinct from
On 03/01/2017 12:15 AM, Johann Spies wrote:
> On 28 February 2017 at 17:06, Adrian Klaver wrote:
>> I have not worked through all this but at first glance I suspect:
>>
>> select distinct b.* from b ...
>>
>> is distinct from ...
>>
>> constitutes a double negative.
>>
>> What happens if you eliminate the first distinct?
>
> Thanks Adrian,
>
> The dynamics of the data have changed because of data updates, so an exact
> comparison is not possible. Other tests now confirm that the 28 records are
> identical in both tables.
>
> The results then become more confusing: If I remove the first distinct and
> use "is distinct from" I get 756 rows, and when I use "is not distinct from"
> I get 28. In the first (756) case, when I use "group by", the result of the
> first query is exactly the same as the second one.

To be clear, you are looking for records in citation that are different from citationbackup over a subset (are there more fields?) of 8 fields, correct? What do those 8 fields represent? Is citationbackup really a backup of citation? Is there a Primary Key on either/both tables? What are you grouping by?

Where I am going with this is that it is not clear to me how you are matching the two sets of records to determine whether they are different or not. Your result that yields 756 rows indicates that the comparison is not an apples-to-apples comparison, but a comparison of two 'shuffled' sets. Adding the group by seems to sort that out.

So some idea of what constitutes a difference, and how you determine which records from each table you want to match, would be helpful. If you could show the table schema and some sample data, it would be even better.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [GENERAL] ERROR: functions in index expression must be marked IMMUTABLE
On 01.03.2017 14:40, Geoff Winkless wrote:
> On 1 March 2017 at 13:36, Sven R. Kunze wrote:
>> On 28.02.2017 17:50, David G. Johnston wrote:
>>> Supposedly one could provide a version of to_date that accepts a locale
>>> in which to interpret names in the input data - or extend the format
>>> string with some kind of "{locale=en_US}" syntax to avoid changing the
>>> function signature.
>>
>> if this is acceptable, I would be willing to contribute a patch to enable
>> exactly this.
>
> I don't see how that would help. You can't set a function to be immutable
> for "some" inputs (ie where locale is specified in the format string). The
> only way to do it would be to add to_date(string, string, string) where the
> third string specifies the locale, at which point I don't really see why
> you would gain anything over creating your own UDF.

I don't consider rolling a UDF the best alternative, especially after having looked through many solution proposals on the Web which just take a mutable expression and wrap it up in an immutable function.

An additional variant of to_date/to_timestamp would have the following advantages (point 2 is the most important):

1) a single, recommended and easy way to parse date(time)s
2) it makes people aware of the locale/timezone issue but gives them a standard tool to solve it
3) it makes all those and related Google entries (https://www.google.de/webhp?sourceid=chrome-instant=1=2=UTF-8#q=postgresql+to_date+immutable&*) eventually point to the same, safe solution

Sven
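For reference, the kind of Web workaround being criticized here looks roughly like this (a sketch; the function and table names are made up). It merely *declares* immutability, so any index built on it silently depends on the locale-sensitive behavior this thread identified never changing:

```sql
CREATE FUNCTION to_date_immutable(text) RETURNS date AS $$
    SELECT to_date($1, 'DD Mon YYYY')  -- per this thread, parsing goes through
                                       -- locale-dependent code (STABLE at best)
$$ LANGUAGE sql IMMUTABLE;             -- a lie if the locale ever changes

CREATE INDEX ON some_table (to_date_immutable(date_text));
```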
Re: [GENERAL] ERROR: functions in index expression must be marked IMMUTABLE
On 1 March 2017 at 13:36, Sven R. Kunze wrote:
> On 28.02.2017 17:50, David G. Johnston wrote:
>
> Supposedly one could provide a version of to_date that accepts a locale in
> which to interpret names in the input data - or extend the format string
> with some kind of "{locale=en_US}" syntax to avoid changing the function
> signature.
>
> if this is acceptable, I would be willing to contribute a patch to enable
> exactly this.

I don't see how that would help. You can't set a function to be immutable for "some" inputs (ie where locale is specified in the format string). The only way to do it would be to add to_date(string, string, string) where the third string specifies the locale, at which point I don't really see why you would gain anything over creating your own UDF.

Geoff
Re: [GENERAL] ERROR: functions in index expression must be marked IMMUTABLE
On 28.02.2017 17:50, David G. Johnston wrote:
> That would seem to be it. cache_locale_time() at the top of DCH_to_char,
> which is in the call stack of the shared parsing code for both to_date and
> to_timestamp.
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/formatting.c;h=e552c8d20b61a082049068d2f8d776e35fef1179;hb=HEAD#l2363
>
> Supposedly one could provide a version of to_date that accepts a locale in
> which to interpret names in the input data - or extend the format string
> with some kind of "{locale=en_US}" syntax to avoid changing the function
> signature.
>
> David J.

I don't know if this is the right way or the right list to ask for this, but if it is acceptable, I would be willing to contribute a patch to enable exactly this.

Regards,
Sven
Re: [GENERAL] is (not) distinct from
On 28 February 2017 at 17:06, Adrian Klaver wrote:
>
> I have not worked through all this but at first glance I suspect:
>
> select distinct b.* from b ...
>
> is distinct from ...
>
> constitutes a double negative.
>
> What happens if you eliminate the first distinct?

Thanks Adrian,

The dynamics of the data have changed because of data updates, so an exact comparison is not possible. Other tests now confirm that the 28 records are identical in both tables.

The results then become more confusing: If I remove the first distinct and use "is distinct from" I get 756 rows, and when I use "is not distinct from" I get 28. In the first (756) case, when I use "group by", the result of the first query is exactly the same as the second one.

Regards
Johann
--
Because experiencing your loyal love is better than life itself, my lips will praise you. (Psalm 63:3)