Amareshwari Sriramadasu created LENS-1333:
Summary: Add data completeness checker
Project: Apache Lens
Issue Type: New Feature
Reporter: Amareshwari Sriramadasu
Although Lens registers partitions whenever data becomes available, there is no
guarantee that a registered partition is complete. There can be different ways
to know whether the data for a partition is complete. One option is to have a
partition property saying whether it is complete or not. Another is to make an
HTTP call to a separately hosted service, and there could be more.
The proposal here is to add a DataCompletenessChecker interface and do the
check while resolving partitions.
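As a rough sketch of what such an interface could look like (only the name
DataCompletenessChecker comes from this proposal; the method signature, the
PropertyBasedChecker class and the "completeness.percent" property key are
assumptions made up for illustration):

```java
import java.util.Map;

// Hypothetical shape for the proposed interface; the method name and
// parameters are assumptions, not an agreed Lens API.
interface DataCompletenessChecker {
    /** Returns the completeness of a partition as a percentage in [0, 100]. */
    double getCompleteness(String factTable, String partitionSpec);
}

// One possible implementation: read completeness from a partition property.
// A checker backed by an HTTP call to another service would implement the
// same interface.
class PropertyBasedChecker implements DataCompletenessChecker {
    // Made-up property key, used only for this sketch.
    static final String COMPLETENESS_KEY = "completeness.percent";

    // Partition properties keyed by "<factTable>/<partitionSpec>".
    private final Map<String, Map<String, String>> partitionProperties;

    PropertyBasedChecker(Map<String, Map<String, String>> partitionProperties) {
        this.partitionProperties = partitionProperties;
    }

    @Override
    public double getCompleteness(String factTable, String partitionSpec) {
        Map<String, String> props = partitionProperties.get(factTable + "/" + partitionSpec);
        if (props == null || props.get(COMPLETENESS_KEY) == null) {
            // Unknown partition or missing property: treat as not complete at all.
            return 0.0;
        }
        try {
            return Double.parseDouble(props.get(COMPLETENESS_KEY));
        } catch (NumberFormatException e) {
            // Unparseable value: also treated as incomplete in this sketch.
            return 0.0;
        }
    }
}
```

Treating unknown or unparseable values as 0 percent is a conservative choice
for the sketch; a real checker might prefer to surface an error instead.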
Here are some of the capabilities we would like to add to Lens:
# Lens will first check that the partition exists and, if it does, check its
completeness percentage. If the completeness percentage is less than a
configured threshold (default should be 98, 99 or even 100), Lens will fail the
query.
# Lens's option to accept queries on partial data will accept incomplete data
as well.
# Lens will also have an option to override the completeness percentage
threshold value at query level.
# Lens will still have look-ahead capability: if daily data is incomplete, it
will union with hourly data.
# If daily partitions exist (with no look-ahead required) but they are
incomplete, Lens can switch to hourly partitions and answer the query.
# If the same measure is present in two different facts, Lens will pick the one
with higher availability.
# If the completeness percentage threshold is missed, Lens will respond with
the available percentage.
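
The threshold-related items above (configured default, query-level override,
accept-on-partial, picking the fact with higher availability, and reporting
the available percentage) could be sketched roughly as follows. All class,
method and field names here are hypothetical and not existing Lens APIs, and
the default of 99 is just one of the values suggested above:

```java
import java.util.Map;
import java.util.Optional;

// Illustrative only: names are assumptions, not Lens APIs.
class CompletenessResolver {
    // The proposal suggests 98, 99 or even 100; 99 is picked arbitrarily here.
    static final double DEFAULT_THRESHOLD = 99.0;

    /** Outcome of a check; carries the available percentage so it can be reported. */
    static final class Result {
        final boolean accepted;
        final double availablePercent;

        Result(boolean accepted, double availablePercent) {
            this.accepted = accepted;
            this.availablePercent = availablePercent;
        }
    }

    /**
     * Accepts the partition if it meets the threshold, or if the query opted in
     * to partial data. A non-null queryThreshold overrides the configured default.
     */
    static Result check(double completeness, Double queryThreshold, boolean acceptPartial) {
        double threshold = (queryThreshold != null) ? queryThreshold : DEFAULT_THRESHOLD;
        boolean accepted = acceptPartial || completeness >= threshold;
        return new Result(accepted, completeness);
    }

    /** When a measure is in several facts, pick the fact with the highest completeness. */
    static Optional<String> pickMostAvailableFact(Map<String, Double> completenessByFact) {
        return completenessByFact.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey);
    }
}
```

A caller that gets a Result with accepted == false would fail the query and
include availablePercent in the error message.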