Hi all,
I've got a (very) basic Spark application in Python that selects some basic
information from my Phoenix table. I can't quite figure out how (or even if
I can) select dynamic columns through this, however.
Here's what I have;
from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext
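For reference, the SQL side of Phoenix's dynamic-column feature declares the extra columns inline in the FROM clause. A minimal sketch (the table and column names here are hypothetical, not from the thread):

```sql
-- DYN_COL is not part of the declared schema; it is declared at query time.
SELECT ID, DYN_COL
FROM MY_TABLE (DYN_COL VARCHAR)
WHERE ID = 'row1';
```

Whether the Spark integration exposes this inline syntax is exactly the open question here; the sketch only shows what the plain-SQL form looks like.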
1. No, I am not confused. A skip scan would "skip" over entire ranges of obj_id
and all create_dt values for it. This will only be effective if there are many
fewer distinct values of obj_id than there are total rows. If there are too many
distinct obj_ids, then it either won't speed the query up or may even slow it
down.
Thanks for the suggestion.
Here are further questions:
1. create_dt (not obj_id; I think you confused the two) would have a large set
of dates, so the SKIP_SCAN hint might not be useful.
2. I created a secondary index on create_dt:
CREATE INDEX IDX1_CREATE_DT ON MY_TABLE(CREATE_DT);
However, EXPLAIN still shows
The missing quotes were the issue. That fixed it. Thanks!
On Wed, Feb 22, 2017 at 8:16 PM, Josh Elser wrote:
> Also, remember that Bash is going to interpret that semi-colon in your URL
> if you don't quote it. It will be treated as two separate commands:
>
>
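To illustrate the behavior described above (the URL here is hypothetical, not the one from the thread): an unquoted semicolon ends the command, so bash runs everything after it as a second command. Single quotes keep the whole URL as one argument.

```shell
# Unquoted, bash splits at the semicolon and tries to run
# "serialization=PROTOBUF" as a second command:
#   sqlline-thin.py http://localhost:8765;serialization=PROTOBUF
# Quoting the argument keeps it as one word:
url='http://localhost:8765;serialization=PROTOBUF'
echo "$url"
# -> http://localhost:8765;serialization=PROTOBUF
```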
Hi Arvind,
The row key is PARENTID, OWNERORGID, MILESTONETYPEID, PARENTTYPE.
Each PARENTID will have a list of MILESTONETYPEID values (19661, 1, 2, etc.).
So your query will return all the PARENTIDs. I am looking for PARENTIDs that do
not have a particular MILESTONETYPEID.
Thanks,
Pradheep
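One way to express "parentids missing a given milestone type" in SQL is an anti-join; a sketch, assuming a single table MY_TABLE and a made-up milestone id (whether Phoenix's correlated-subquery support covers this exact shape may need checking):

```sql
-- PARENTIDs that have no row with MILESTONETYPEID = 19661 (hypothetical id).
SELECT DISTINCT t.PARENTID
FROM MY_TABLE t
WHERE NOT EXISTS (
    SELECT 1
    FROM MY_TABLE m
    WHERE m.PARENTID = t.PARENTID
      AND m.MILESTONETYPEID = 19661
);
```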
From: Arvind S
If there are not a large number of distinct values of obj_id, try a SKIP_SCAN
hint. Otherwise, the secondary index should work; make sure it's actually being
used via EXPLAIN. Finally, you might try the ROW_TIMESTAMP feature if it fits
your use case.
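For reference, a hinted query might look like the sketch below; the hint asks Phoenix to skip-scan over the leading obj_id ranges rather than fully scan them. The column names follow the thread, but the date range is made up:

```sql
SELECT /*+ SKIP_SCAN */ obj_id, create_dt
FROM MY_TABLE
WHERE create_dt >= TO_DATE('2017-01-01')
  AND create_dt <  TO_DATE('2017-02-01');
```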
> On Feb 22, 2017, at 11:30 PM, NaHeon Kim
I believe Phoenix tracks the current value of each sequence.
When your client requests 100 values, it would use 1-100, but Phoenix
only needs to know that the next value it can give out is 101. I'm not
100% sure, but I think this is how it works.
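The cached-allocation idea above can be sketched in plain Python (this is a simulation, not Phoenix internals; all names are made up): the client reserves a block of values, and the server only has to remember the next unreserved value.

```python
class SequenceServer:
    """Simulates server-side state: only the next unreserved value."""
    def __init__(self, start=1):
        self.next_value = start  # the only thing the server must remember

    def reserve(self, count):
        """Reserve `count` values; return the first value of the block."""
        first = self.next_value
        self.next_value += count  # server now only knows first + count
        return first


class CachedClient:
    """Draws values from its reserved block without contacting the server."""
    def __init__(self, server, cache_size=100):
        self.server = server
        self.cache_size = cache_size
        self.current = None  # next value to hand out locally
        self.limit = 0       # first value NOT in our reserved block

    def next(self):
        if self.current is None or self.current >= self.limit:
            first = self.server.reserve(self.cache_size)
            self.current = first
            self.limit = first + self.cache_size
        value = self.current
        self.current += 1
        return value


server = SequenceServer()
client = CachedClient(server, cache_size=100)
values = [client.next() for _ in range(3)]
# values -> [1, 2, 3]; the client uses 1-100 locally, while the
# server already records 101 as the next value it can give out.
```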
What are you concerned about?
Cheyenne