It was probably lost in the wall of text, but the point I was trying to convey was that fetching all rows via protobuf crashed PQS, while fetching them via JSON didn't. Basically, the only way I can get protobuf serialization to work is to (1) bump up the PQS memory and/or (2) fetch in batches; I don't have to do either with JSON.
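For reference, the batching workaround I mean looks roughly like the sketch below. It is just the fetch script from step 2.2 further down, reworked to pull rows in chunks with fetchmany() instead of a single fetchall(); the PQS URL is a placeholder and the chunk size is arbitrary.

import phoenixdb

database_url = 'http://localhost:8765/'  # placeholder; point this at your PQS
sql = 'select ind_id, a, b, c from test_perf'

conn = phoenixdb.connect(database_url, autocommit=True)
cursor = conn.cursor()
# leave cursor.itersize at its default (2000) so PQS serves frames of
# 2000 rows instead of materializing the whole result set at once
cursor.execute(sql)

total = 0
while True:
    batch = cursor.fetchmany(2000)  # chunk size is arbitrary
    if not batch:
        break
    total += len(batch)
print('fetched %d rows' % total)

With the default itersize the same ~6 million rows come back without PQS running out of memory; it's only the fetch-everything-in-one-frame case (itersize = -2) that pushes PQS over its heap.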
- Manoj

On Fri, Nov 9, 2018 at 10:59 AM Josh Elser <els...@apache.org> wrote:

> Manoj,
>
> re: #2 Please start by reading:
> https://plumbr.io/outofmemoryerror/gc-overhead-limit-exceeded
>
> This should give you a thorough explanation of what this error means.
> You need to increase the memory footprint and/or JVM GC properties for
> PQS to address your issue. When you request all of the rows at once, of
> course PQS needs to hold them all in memory. Please see my earlier
> instructions on how to set JVM properties for PQS.
>
> re: #3
>
> I don't understand what you mean by "need to make protobuf faster".
> Just telling me what you observe from a high elevation doesn't help me
> give you any kind of recommendations. I would imagine that you first
> need to solve the above issue before you start requesting large batches
> of records from PQS (with *either* serialization).
>
> It sounds like the JSON serialization in the Python driver does not
> match what PQS/Avatica currently expects. Avatica provides no guarantee
> of wire compatibility for JSON, so you may just be hitting code that
> needs to be updated.
>
> On 11/8/18 7:14 PM, Manoj Ganesan wrote:
> > Thanks for the pointers, Josh.
> >
> > Here's a set of instructions to demonstrate the performance issues
> > I'm seeing when using protobuf serialization. Hopefully these are
> > comprehensive enough to reproduce the issue. If not, please let me
> > know and I'm happy to provide clarifications.
> >
> > Basic outline of the instructions:
> > 1. Create and populate a test table
> > 2. Use the Python adapter to fetch rows
> > 3. JSON serialization
> >
> > 1. Create and populate a test table
> >
> > 1.1. Create a table:
> >
> > $> /usr/lib/phoenix/bin/sqlline.py localhost
> > CREATE TABLE test_perf (ind_id integer not null, a tinyint, b tinyint,
> > c varchar(12), constraint pk primary key (ind_id));
> >
> > 1.2. I wrote this simple Python script to populate the table with
> > about 6 million rows:
> >
> > # data_generator.py
> > import string
> > import csv
> > import random
> >
> > for i in range(0, 6000000):
> >     print ','.join([str(i),
> >                     str(random.randint(-127, 127)),
> >                     str(random.randint(-127, 127)),
> >                     ''.join(random.choice(string.ascii_uppercase + string.digits)
> >                             for _ in range(12))])
> >
> > # run the script
> > $> python data_generator.py > in.csv
> >
> > 1.3. Import rows into the table:
> >
> > $> /opt/phoenix/bin/psql.py -t TEST_PERF <cluster_address> in.csv
> >
> > 2. Use the Python adapter to fetch rows
> >
> > 2.1. Install the Python adapter
> >
> > The instructions are detailed in the project README
> > (https://github.com/apache/phoenix/tree/master/python/phoenixdb).
> >
> > pip install phoenixdb
> >
> > 2.2. Write a script to fetch rows
> >
> > import phoenixdb
> > from time import time
> >
> > database_url = '<your_pqs_url>'
> > sql = 'select ind_id, a, b, c from test_perf'
> >
> > conn = phoenixdb.connect(database_url, autocommit=True)
> >
> > print 'connecting to %s' % database_url
> > cursor = conn.cursor()
> > cursor.itersize = -2  # to fetch all rows at once
> > s = time()
> > print 'executing query %s' % sql
> > cursor.execute(sql)
> > print 'time to execute = %s' % (time() - s)
> >
> > s = time()
> > print 'fetching results'
> > results = cursor.fetchall()
> > print 'time to fetchall = %s' % (time() - s)
> >
> > 2.3. Execute the fetch script
> >
> > When I execute the above script to fetch all rows at once (i.e.,
> > using cursor.itersize = -2), PQS crashes and returns an error:
> >
> > Traceback (most recent call last):
> >   File "queries/protobuf_client.py", line 12, in <module>
> >     cursor = conn.cursor()
> >   File "/Users/mganesan/ttam/local-scripts/venv/lib/python2.7/site-packages/phoenixdb/cursor.py", line 201, in execute
> >     operation, first_frame_max_size=self.itersize)
> >   File "/Users/mganesan/ttam/local-scripts/venv/lib/python2.7/site-packages/phoenixdb/avatica.py", line 405, in prepare_and_execute
> >     response_data = self._apply(request, 'ExecuteResponse')
> >   File "/Users/mganesan/ttam/local-scripts/venv/lib/python2.7/site-packages/phoenixdb/avatica.py", line 213, in _apply
> >     parse_error_page(response_body)
> >   File "/Users/mganesan/ttam/local-scripts/venv/lib/python2.7/site-packages/phoenixdb/avatica.py", line 117, in parse_error_page
> >     raise errors.InternalError(message)
> > phoenixdb.errors.InternalError: ('GC overhead limit exceeded', None, None, None)
> >
> > If I fetch 2000 rows at a time (i.e., cursor.itersize = 2000, which
> > is also the default), the fetch finishes in 360s.
> >
> > 3. JSON serialization
> >
> > One of the issues I mentioned in my previous email was that JSON
> > serialization does not seem to be supported by the Python client,
> > even though I was making the queries as per the Avatica JSON
> > reference
> > (https://calcite.apache.org/avatica/docs/json_reference.html). The
> > client I wrote always received 0 rows, but no errors. On digging
> > through the source a bit, I realized that with the version used
> > within Phoenix (the latest?), 2 new arguments are expected with the
> > 'prepareAndExecute' method: 'maxRowsTotal' and 'maxRowsInFirstFrame'.
> > Essentially the request should look like the following (note the
> > last 2 args):
> >
> > {
> >     'request': 'prepareAndExecute',
> >     'sql': 'select ind_id, a, b, c from test_perf',
> >     'connectionId': '2747c631-440e-405b-8185-44d1cc7f266c',
> >     'statementId': 1,
> >     'maxRowCount': -2,
> >     'maxRowsTotal': -2,
> >     'maxRowsInFirstFrame': -2,
> > }
> >
> > "sqlline-thin.py -s JSON" works because the Avatica client driver
> > that Phoenix/sqlline uses makes the right HTTP calls, with the
> > correct args.
> >
> > The response time using JSON serialization and fetching *all* rows
> > (cursor.itersize = -2) was 60s, which is much faster than the 360s
> > it takes with protobuf and cursor.itersize = 2000 (note that protobuf
> > did not work when fetching all rows at once, i.e.,
> > cursor.itersize = -2).
> >
> > Because of these issues, for now we have switched PQS to JSON
> > serialization and updated our clients to use the same. We're
> > obviously very much interested in understanding how the protobuf
> > path can be made faster.
> >
> > Thanks again for the help!
> > Manoj
> >
> > On Tue, Nov 6, 2018 at 9:51 AM Josh Elser <els...@apache.org> wrote:
> >
> > > On 11/5/18 10:10 PM, Manoj Ganesan wrote:
> > > > Thanks for the pointers, Josh. I'm working on getting a
> > > > representative concise test to demonstrate the issue.
> > > >
> > > > Meanwhile, I had one question regarding the following:
> > > >
> > > > > You are right that the operations in PQS should be exactly the
> > > > > same, regardless of the client you're using -- that is how this
> > > > > architecture works.
> > > >
> > > > IIUC, this means the following 2 methods should yield the same
> > > > result:
> > > >
> > > > 1. sqlline-thin.py -s JSON <query_file>
> > > > 2. using a python avatica client script making JSON requests
> > >
> > > That's correct. Any client which speaks to PQS should see the same
> > > results. There may be bugs in the client implementation, of course,
> > > which make this statement false.
> > >
> > > > I made the following change in hbase-site.xml on the PQS host:
> > > >
> > > > <property>
> > > >     <name>phoenix.queryserver.serialization</name>
> > > >     <value>JSON</value>
> > > > </property>
> > > >
> > > > I notice that executing "sqlline-thin.py -s JSON <query_file>"
> > > > returns results just fine. However, when I use a simple script to
> > > > try the same query, it returns 0 rows. I'm attaching the Python
> > > > script here. The script essentially makes HTTP calls using the
> > > > Avatica JSON reference
> > > > <https://calcite.apache.org/avatica/docs/json_reference.html>. I
> > > > assumed that the sqlline-thin wrapper (when passed the -s JSON
> > > > flag) also makes HTTP calls based on the JSON reference, is that
> > > > not correct?
> > >
> > > Apache mailing lists strip attachments. Please consider hosting it
> > > somewhere else, along with instructions/scripts to generate the
> > > required tables. Please provide some more analysis of the problem
> > > than just a summarization of what you see as an end user -- I don't
> > > have the cycles or interest to debug the entire system for you :)
> > >
> > > Avatica is a protocol that interprets JDBC using some serialization
> > > (JSON or Protobuf today) and a transport (only HTTP) to a remote
> > > server to run the JDBC operations. So, yes: an Avatica client is
> > > always using HTTP, given whatever serialization you instruct it to
> > > use.
> > >
> > > > I'll work on getting some test cases here soon to illustrate this
> > > > as well as the performance problem.
> > > >
> > > > Thanks again!
> > > > Manoj
> > > >
> > > > On Mon, Nov 5, 2018 at 10:43 AM Josh Elser <els...@apache.org>
> > > > wrote:
> > > >
> > > > > Is the OOME issue regardless of using the Java client
> > > > > (sqlline-thin) and the Python client? I would like to know more
> > > > > about this one. If you can share something that reproduces the
> > > > > problem for you, I'd like to look into it. The only suggestion
> > > > > I have at this point in time is to make sure you set a
> > > > > reasonable max-heap size in hbase-env.sh (e.g. -Xmx) via
> > > > > PHOENIX_QUERYSERVER_OPTS and have HBASE_CONF_DIR pointing to
> > > > > the right directory when you launch PQS.
> > > > >
> > > > > Regarding performance, as you've described it, it sounds like
> > > > > the Python driver is just slower than the Java driver. You are
> > > > > right that the operations in PQS should be exactly the same,
> > > > > regardless of the client you're using -- that is how this
> > > > > architecture works. Avatica is a wire protocol that all clients
> > > > > use to talk to PQS. More digging/information you can provide
> > > > > about the exact circumstances (and, again, steps/environment to
> > > > > reproduce what you see) would be extremely helpful.
> > > > >
> > > > > Thanks, Manoj.
> > > > >
> > > > > - Josh
> > > > >
> > > > > On 11/2/18 7:16 PM, Manoj Ganesan wrote:
> > > > > > Thanks, Josh, for the response!
> > > > > >
> > > > > > I would definitely like to use protobuf serialization, but
> > > > > > I'm observing performance issues trying to run queries with
> > > > > > a large number of results.
> > > > > > One problem is that I observe PQS runs out of memory when
> > > > > > it's trying to (what looks like to me) serialize the results
> > > > > > in Avatica. The other is that the phoenixdb python adapter
> > > > > > itself spends a large amount of time in the logic
> > > > > > <https://github.com/apache/phoenix/blob/master/python/phoenixdb/phoenixdb/cursor.py#L248>
> > > > > > where it's converting the protobuf rows to python objects.
> > > > > >
> > > > > > Interestingly, when we use sqlline-thin.py instead of python
> > > > > > phoenixdb, the protobuf serialization works fine and
> > > > > > responses are fast. It's not clear to me why PQS would have
> > > > > > problems when using the python adapter and not when using
> > > > > > sqlline-thin; do they follow different code paths (especially
> > > > > > around serialization)?
> > > > > >
> > > > > > Thanks again,
> > > > > > Manoj
> > > > > >
> > > > > > On Fri, Nov 2, 2018 at 4:05 PM Josh Elser <els...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > I would strongly suggest you do not use the JSON
> > > > > > > serialization.
> > > > > > >
> > > > > > > The JSON support is implemented via Jackson, which has no
> > > > > > > means to make backwards compatibility "easy". In contrast,
> > > > > > > protobuf makes this extremely easy, and we have multiple
> > > > > > > examples over the past years where we've been able to fix
> > > > > > > bugs in a backwards-compatible manner.
> > > > > > >
> > > > > > > If you want the thin client to continue to work across
> > > > > > > versions, stick with protobuf.
> > > > > > >
> > > > > > > On 11/2/18 5:27 PM, Manoj Ganesan wrote:
> > > > > > > > Hey everyone,
> > > > > > > >
> > > > > > > > I'm trying to make the Python phoenixdb adapter work with
> > > > > > > > JSON serialization on PQS.
> > > > > > > >
> > > > > > > > I'm using Phoenix 4.14 and the adapter works fine with
> > > > > > > > protobuf, but when I try making it work with an older
> > > > > > > > version of phoenixdb (before the JSON-to-protobuf switch
> > > > > > > > was introduced), it just returns 0 rows. I don't see
> > > > > > > > anything in particular wrong with the HTTP requests
> > > > > > > > themselves, and they seem to conform to the Avatica JSON
> > > > > > > > spec
> > > > > > > > (http://calcite.apache.org/avatica/docs/json_reference.html).
> > > > > > > >
> > > > > > > > Here's the result (with some debug statements) that
> > > > > > > > returns 0 rows.
> > > > > > > > Notice the *"firstFrame":{"offset":0,"done":true,"rows":[]* below:
> > > > > > > >
> > > > > > > > request body = {"maxRowCount": -2, "connectionId":
> > > > > > > > "68c05d12-5770-47d6-b3e4-dba556db4790", "request":
> > > > > > > > "prepareAndExecute", "statementId": 3, "sql": "SELECT
> > > > > > > > col1, col2 from table limit 20"}
> > > > > > > > request headers = {'content-type': 'application/json'}
> > > > > > > > _post_request: got response {'fp': <socket._fileobject
> > > > > > > > object at 0x7f858330b9d0>, 'status': 200, 'will_close':
> > > > > > > > False, 'chunk_left': 'UNKNOWN', 'length': 1395, 'strict':
> > > > > > > > 0, 'reason': 'OK', 'version': 11, 'debuglevel': 0, 'msg':
> > > > > > > > <httplib.HTTPMessage instance at 0x7f84fb50be18>,
> > > > > > > > 'chunked': 0, '_method': 'POST'}
> > > > > > > > response.read(): body =
> > > > > > > > {"response":"executeResults","missingStatement":false,"rpcMetadata":{"response":"rpcMetadata","serverAddress":"ip-10-55-6-247:8765"},"results":[{"response":"resultSet","connectionId":"68c05d12-5770-47d6-b3e4-dba556db4790","statementId":3,"ownStatement":true,"signature":{"columns":[{"ordinal":0,"autoIncrement":false,"caseSensitive":false,"searchable":true,"currency":false,"nullable":0,"signed":true,"displaySize":40,"label":"COL1","columnName":"COL1","schemaName":"","precision":0,"scale":0,"tableName":"TABLE","catalogName":"","type":{"type":"scalar","id":4,"name":"INTEGER","rep":"PRIMITIVE_INT"},"readOnly":true,"writable":false,"definitelyWritable":false,"columnClassName":"java.lang.Integer"},{"ordinal":1,"autoIncrement":false,"caseSensitive":false,"searchable":true,"currency":false,"nullable":0,"signed":true,"displaySize":40,"label":"COL2","columnName":"COL2","schemaName":"","precision":0,"scale":0,"tableName":"TABLE","catalogName":"","type":{"type":"scalar","id":4,"name":"INTEGER","rep":"PRIMITIVE_INT"},"readOnly":true,"writable":false,"definitelyWritable":false,"columnClassName":"java.lang.Integer"}],"sql":null,"parameters":[],"cursorFactory":{"style":"LIST","clazz":null,"fieldNames":null},"statementType":null},*"firstFrame":{"offset":0,"done":true,"rows":[]*},"updateCount":-1,"rpcMetadata":{"response":"rpcMetadata","serverAddress":"ip-10-55-6-247:8765"}}]}
> > > > > > > >
> > > > > > > > The same query issued against a PQS started with PROTOBUF
> > > > > > > > serialization and using a newer phoenixdb adapter returns
> > > > > > > > the correct number of rows.
> > > > > > > >
> > > > > > > > Has anyone had luck making this work?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Manoj
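P.S. For anyone else hitting the 0-rows problem above: what fixed it for me was adding the two newer arguments ('maxRowsTotal' and 'maxRowsInFirstFrame') to the 'prepareAndExecute' request, as described in section 3 of my instructions earlier in this thread. A rough sketch of the raw HTTP call is below. It assumes the requests library, a PQS configured with JSON serialization, and that a connection and statement have already been opened via the 'openConnection' and 'createStatement' requests from the Avatica JSON reference; the URL and IDs are placeholders.

import json
import requests

pqs_url = 'http://localhost:8765/'  # placeholder PQS address

request_body = {
    'request': 'prepareAndExecute',
    'connectionId': '2747c631-440e-405b-8185-44d1cc7f266c',  # placeholder, from openConnection
    'statementId': 1,                                        # placeholder, from createStatement
    'sql': 'select ind_id, a, b, c from test_perf',
    'maxRowCount': -2,
    # the two arguments newer Avatica versions expect; without them the
    # server answered me with an empty firstFrame like the one above
    'maxRowsTotal': -2,
    'maxRowsInFirstFrame': -2,
}

response = requests.post(pqs_url,
                         data=json.dumps(request_body),
                         headers={'content-type': 'application/json'})
print(response.json())

With those two arguments included, the executeResults payload should come back with rows in firstFrame instead of the empty frame shown above.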