Gagan - did you see any errors in the HAWQ or Hadoop logs? This doesn't look like a PXF issue.
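For reference, a rough sketch of where those logs usually live; the paths below are assumptions based on a typical single-node layout and on the catalina.base visible in the ps output further down the thread, so adjust them to your install:

# HAWQ logs for the segment named in the error (seg0 on port 40000); HAWQ
# writes CSV logs under each data directory's pg_log (paths are assumptions)
ls -lt $MASTER_DATA_DIRECTORY/pg_log | head
grep -i "failed sending to remote component" <seg0_data_directory>/pg_log/*.csv

# PXF/Tomcat log; catalina.base is /var/pxf/pxf-service per the ps output
tail -n 200 /var/pxf/pxf-service/logs/catalina.out

# HDFS NameNode log around the time of the INSERT (log dir is an assumption)
grep foo_bar /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log | tail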
On Fri, Mar 11, 2016 at 8:59 AM Gagan Brahmi <[email protected]> wrote:

> And this is what I mean when I say that fetch (select) seems to be
> working. I placed a dummy file to run a select query against, and it
> returns the results.
>
> hdfs@my-hadoop-cluster:~> cat /tmp/test_fetch
> 100 | Random Value
> 101 | Another Random
> hdfs@my-hadoop-cluster:~> hadoop fs -put /tmp/test_fetch /tmp/foo_bar/
> hdfs@suse11-workplace:~> logout
> my-hadoop-cluster:~ # su - gpadmin
> gpadmin@my-hadoop-cluster:~> source /usr/local/hawq/greenplum_path.sh
> gpadmin@my-hadoop-cluster:~> psql -p 10432 gagan
> psql (8.2.15)
> Type "help" for help.
>
> gagan=# SELECT * FROM ext_get_foo ;
>   i  |       bar
> -----+-----------------
>  100 | Random Value
>  101 | Another Random
> (2 rows)
>
> gagan=#
>
> Table DDLs:
>
> gagan=# CREATE WRITABLE EXTERNAL TABLE ext_put_foo (i int, bar text)
> LOCATION ('pxf://my-hadoop-cluster:51200/tmp/foo_bar?profile=HdfsTextSimple')
> FORMAT 'text' (delimiter '|' null 'null');
> CREATE EXTERNAL TABLE
> gagan=# CREATE EXTERNAL TABLE ext_get_foo (i int, bar text)
> LOCATION ('pxf://my-hadoop-cluster:51200/tmp/foo_bar?profile=HdfsTextSimple')
> FORMAT 'text' (delimiter '|' null 'null');
> CREATE EXTERNAL TABLE
> gagan=# INSERT into ext_put_foo VALUES (1, 'Gagan');
> ERROR:  failed sending to remote component (libchurl.c:574) (seg0
> my-hadoop-cluster:40000 pid=824) (dispatcher.c:1753)
>
> Regards,
> Gagan Brahmi
>
> On Fri, Mar 11, 2016 at 9:53 AM, Gagan Brahmi <[email protected]> wrote:
> > Nothing in the pxf-service.log or the catalina.out for the pxf service.
> >
> > It has the normal startup messages while the webapp starts up. I did
> > modify the overcommit memory setting to 1 and restarted all the
> > services (just in case), but that still didn't seem to have made any
> > difference.
> >
> > I still see the "file is closed by DFSClient" message in HDFS every
> > time I try to run an insert command. The select looks to be working
> > fine.
> >
> > Regards,
> > Gagan Brahmi
> >
> > On Fri, Mar 11, 2016 at 9:21 AM, Daniel Lynch <[email protected]> wrote:
> >> Check the pxf service logs for errors. I suspect there is an out of
> >> memory event at some point during the connection, considering this is
> >> a single node deployment.
> >>
> >> Also make sure the overcommit check is disabled to prevent virtual mem
> >> OOM errors. This of course would not be recommended in production, but
> >> for single node deployments you will need this setting.
> >> echo 1 > /proc/sys/vm/overcommit_memory
> >>
> >> Daniel Lynch
> >> Mon-Fri 9-5 PST
> >> Office: 408 780 4498
> >>
> >> On Fri, Mar 11, 2016 at 2:01 AM, Gagan Brahmi <[email protected]> wrote:
> >>
> >>> This is a standalone box with no HA for HDFS.
> >>>
> >>> I haven't enabled the HA properties in hawq-site.
> >>>
> >>> Regards,
> >>> Gagan
> >>> On Mar 11, 2016 00:56, "Leon Zhang" <[email protected]> wrote:
> >>>
> >>> > Hi, Gagan
> >>> >
> >>> > It seems you are using an HA HDFS cluster? I am not sure if HAWQ
> >>> > can work like this. Can any HAWQ developer clarify this condition?
> >>> > If so, you can try a non-HA HDFS cluster with direct IP access. All
> >>> > PXF services are working perfectly here.
> >>> >
> >>> > On Fri, Mar 11, 2016 at 10:25 AM, Gagan Brahmi <[email protected]>
> >>> > wrote:
> >>> >
> >>> >> Thank you Ting!
> >>> >>
> >>> >> That was the problem. It seemed to have worked, but now I am stuck
> >>> >> with a different error.
> >>> >>
> >>> >> gagan=# INSERT into ext_put_foo VALUES (1, 'Gagan');
> >>> >> ERROR:  failed sending to remote component (libchurl.c:574) (seg0
> >>> >> my-hadoop-cluster:40000 pid=24563) (dispatcher.c:1753)
> >>> >>
> >>> >> This certainly means that the background service has stopped
> >>> >> serving connections for some reason.
> >>> >>
> >>> >> I checked the namenode and found this:
> >>> >>
> >>> >> 2016-03-10 19:28:11,759 INFO hdfs.StateChange
> >>> >> (FSNamesystem.java:completeFile(3503)) - DIR* completeFile:
> >>> >> /tmp/foo_bar/1350_0 is closed by DFSClient_NONMAPREDUCE_-244490296_23
> >>> >>
> >>> >> I have a single node installation with an HDFS replication factor
> >>> >> of 1 (both in hdfs-site and hdfs-client for HAWQ).
> >>> >>
> >>> >> I have also tried updating the connectTimeout value to 60 secs in
> >>> >> the server.xml file for the pxf webapp.
> >>> >>
> >>> >> A normal write to HDFS works fine. I see files being created in
> >>> >> the foo_bar directory, but they are 0 bytes in size.
> >>> >>
> >>> >> -rw-r--r--   1 pxf hdfs   0 2016-03-10 19:08 /tmp/foo_bar/1336_0
> >>> >> -rw-r--r--   1 pxf hdfs   0 2016-03-10 19:27 /tmp/foo_bar/1349_0
> >>> >> -rw-r--r--   1 pxf hdfs   0 2016-03-10 19:28 /tmp/foo_bar/1350_0
> >>> >>
> >>> >> Not sure if someone has encountered this before. Would appreciate
> >>> >> any inputs.
> >>> >>
> >>> >> Regards,
> >>> >> Gagan Brahmi
> >>> >>
> >>> >> On Thu, Mar 10, 2016 at 11:45 AM, Ting(Goden) Yao <[email protected]>
> >>> >> wrote:
> >>> >> > Your table definition:
> >>> >> > ('pxf://my-hadoop-cluster:*50070*/foo_bar?profile=HdfsTextSimple')
> >>> >> > If you installed pxf on 51200, you need to use the port 51200.
> >>> >> >
> >>> >> > On Thu, Mar 10, 2016 at 10:34 AM Gagan Brahmi <[email protected]>
> >>> >> > wrote:
> >>> >> >
> >>> >> >> Hi Team,
> >>> >> >>
> >>> >> >> I was wondering if someone has encountered this problem before.
> >>> >> >>
> >>> >> >> While trying to work with PXF on HAWQ 2.0 I am encountering the
> >>> >> >> following error:
> >>> >> >>
> >>> >> >> gagan=# CREATE EXTERNAL TABLE ext_get_foo (i int, bar text) LOCATION
> >>> >> >> ('pxf://my-hadoop-cluster:50070/foo_bar?profile=HdfsTextSimple')
> >>> >> >> FORMAT 'text' (delimiter '|' null 'null');
> >>> >> >>
> >>> >> >> gagan=# SELECT * FROM ext_get_foo ;
> >>> >> >> ERROR:  remote component error (404): PXF service could not be
> >>> >> >> reached. PXF is not running in the tomcat container (libchurl.c:878)
> >>> >> >>
> >>> >> >> The same happens when I try to write to an external table using PXF.
> >>> >> >>
> >>> >> >> I believe the above error signifies that the PXF service isn't
> >>> >> >> running or is unavailable. But PXF is running on port 51200.
> >>> >> >>
> >>> >> >> The curl response works fine as well:
> >>> >> >>
> >>> >> >> # curl -s http://localhost:51200/pxf/v0
> >>> >> >> Wrong version v0, supported version is v14
> >>> >> >>
> >>> >> >> PXF is built using gradlew and installed from the RPM files. I
> >>> >> >> also have tomcat 7.0.62 installed with the PXF packages.
> >>> >> >>
> >>> >> >> The following is how PXF is running on the instance:
> >>> >> >>
> >>> >> >> pxf 21405 0.3 2.8 825224 115164 ? Sl 02:07 0:10
> >>> >> >> /usr/java/latest/bin/java
> >>> >> >> -Djava.util.logging.config.file=/var/pxf/pxf-service/conf/logging.properties
> >>> >> >> -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
> >>> >> >> -Xmx512M -Xss256K -Djava.endorsed.dirs=/var/pxf/pxf-service/endorsed
> >>> >> >> -classpath
> >>> >> >> /var/pxf/pxf-service/bin/bootstrap.jar:/var/pxf/pxf-service/bin/tomcat-juli.jar
> >>> >> >> -Dcatalina.base=/var/pxf/pxf-service
> >>> >> >> -Dcatalina.home=/var/pxf/pxf-service
> >>> >> >> -Djava.io.tmpdir=/var/pxf/pxf-service/temp
> >>> >> >> org.apache.catalina.startup.Bootstrap start
> >>> >> >>
> >>> >> >> I do not have apache-tomcat running. Not sure how the two are
> >>> >> >> interrelated, but the RPM file created by gradlew requires tomcat
> >>> >> >> for pxf-service.
> >>> >> >>
> >>> >> >> I would appreciate any inputs on this problem.
> >>> >> >>
> >>> >> >> Regards,
> >>> >> >> Gagan Brahmi
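One more thing worth ruling out is whether the segment's request ever reaches the PXF webapp during the INSERT. A rough sketch, assuming the catalina.base from the ps output above and that the stock Tomcat access-log valve is enabled in its server.xml (both are assumptions, so adjust the paths as needed):

# confirm something is listening on the PXF port
ss -ltn | grep 51200        # or: netstat -ltn | grep 51200

# the version check from earlier in the thread; any answer means the REST
# service itself is up
curl -s http://localhost:51200/pxf/v0

# watch Tomcat's access log while the INSERT runs to see whether the request
# from seg0 arrives at all (file name and location are assumptions)
tail -f /var/pxf/pxf-service/logs/localhost_access_log.$(date +%Y-%m-%d).txt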
