[jira] [Updated] (ATLAS-4808) Automatic data classification Support by Atlas

2023-11-08 Thread Jagadesh Kiran N (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadesh Kiran N updated ATLAS-4808:

Description: 
a. We maintain a data lake. We store the data in folders in HDFS, and from 
there Hive takes the data and stores it in tables.

One of the Hive tables has a bunch of columns, and one of those columns 
contains PII data. Can Atlas automatically scan the data inside and detect any 
PII (such as name, email, or phone number)?

If it finds PII, it needs to mark the data classification for that object as 
PII (classifying the columns in the attribute) and then propagate the 
classification to any child objects created from that object, scanning the 
data automatically and asynchronously.

The scan process would classify a column as PII or other relevant information.

 

  was:
a. We maintain a data lake. We store the data in folders in HDFS, and from 
there Hive takes the data and stores it in tables.

One of the Hive tables has a bunch of columns, and one of those columns 
contains PII data. Can Atlas automatically scan the data inside and detect any 
PII (such as name, email, or phone number)?

If it finds PII, it needs to mark the data classification for that object as 
PII (classifying the columns in the attribute) and then propagate the 
classification to any child objects created from that object, scanning the 
data automatically and asynchronously.

The scan process would classify a column as PII or other relevant information.

b. Also, does Atlas support an include/exclude option for tables, topics, etc.?


> Automatic data classification Support by Atlas
> --
>
> Key: ATLAS-4808
> URL: https://issues.apache.org/jira/browse/ATLAS-4808
> Project: Atlas
> Issue Type: Wish
> Reporter: Jagadesh Kiran N
> Priority: Minor
>
> a. We maintain a data lake. We store the data in folders in HDFS, and from 
> there Hive takes the data and stores it in tables.
> One of the Hive tables has a bunch of columns, and one of those columns 
> contains PII data. Can Atlas automatically scan the data inside and detect 
> any PII (such as name, email, or phone number)?
> If it finds PII, it needs to mark the data classification for that object as 
> PII (classifying the columns in the attribute) and then propagate the 
> classification to any child objects created from that object, scanning the 
> data automatically and asynchronously.
> The scan process would classify a column as PII or other relevant information.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
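
The scan requested above could look roughly like the following. This is only a minimal sketch of the idea, not an existing Atlas feature: the names `PII_PATTERNS`, `classify_column`, and `scan_table`, the two regular expressions, and the 0.8 match threshold are all hypothetical and chosen for illustration. It works on sampled column values already held in memory and leaves out reading from Hive, tagging the entity in Atlas, and propagation.

```python
import re

# Hypothetical, deliberately simple patterns; real PII detection needs far more care.
PII_PATTERNS = {
    "email": re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"),
    "phone": re.compile(r"^\+?[0-9][0-9 ()-]{6,18}[0-9]$"),
}


def classify_column(values, threshold=0.8):
    """Return the PII kind matched by at least `threshold` of the non-empty values, or None."""
    samples = [v.strip() for v in values if v and v.strip()]
    if not samples:
        return None
    for kind, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in samples if pattern.match(v))
        if hits / len(samples) >= threshold:
            return kind
    return None


def scan_table(columns, threshold=0.8):
    """Map each column name to its detected PII kind, omitting columns with no match."""
    result = {}
    for name, values in columns.items():
        kind = classify_column(values, threshold)
        if kind is not None:
            result[name] = kind
    return result
```

Calling `scan_table` with a dict that maps column names to lists of sampled string values returns only the columns that look like PII; a separate step (not shown) would then apply the classification to the matching column entities.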


[jira] [Updated] (ATLAS-4808) Automatic data classification Support by Atlas

2023-11-06 Thread Jagadesh Kiran N (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadesh Kiran N updated ATLAS-4808:

Description: 
a. We maintain a data lake. We store the data in folders in HDFS, and from 
there Hive takes the data and stores it in tables.

One of the Hive tables has a bunch of columns, and one of those columns 
contains PII data. Can Atlas automatically scan the data inside and detect any 
PII (such as name, email, or phone number)?

If it finds PII, it needs to mark the data classification for that object as 
PII (classifying the columns in the attribute) and then propagate the 
classification to any child objects created from that object, scanning the 
data automatically and asynchronously.

The scan process would classify a column as PII or other relevant information.

b. Also, does Atlas support an include/exclude option for tables, topics, etc.?

  was:
a. We maintain a data lake. We store the data in folders in HDFS, and from 
there Hive takes the data and stores it in tables.

One of the Hive tables has a bunch of columns, and one of those columns 
contains PII data. Can the Atlas product automatically scan the data inside and 
detect any PII (such as name, email, or phone number)?

If it finds PII, it needs to mark the data classification for that object as 
PII (classifying the columns in the attribute) and then propagate the 
classification to any child objects created from that object, scanning the 
data automatically and asynchronously.

The scan process would classify a column as PII or other relevant information.

b. Also, does Atlas support an include/exclude option for tables, topics, etc.?


> Automatic data classification Support by Atlas
> --
>
> Key: ATLAS-4808
> URL: https://issues.apache.org/jira/browse/ATLAS-4808
> Project: Atlas
> Issue Type: Wish
> Reporter: Jagadesh Kiran N
> Priority: Minor
>
> a. We maintain a data lake. We store the data in folders in HDFS, and from 
> there Hive takes the data and stores it in tables.
> One of the Hive tables has a bunch of columns, and one of those columns 
> contains PII data. Can Atlas automatically scan the data inside and detect 
> any PII (such as name, email, or phone number)?
> If it finds PII, it needs to mark the data classification for that object as 
> PII (classifying the columns in the attribute) and then propagate the 
> classification to any child objects created from that object, scanning the 
> data automatically and asynchronously.
> The scan process would classify a column as PII or other relevant information.
> b. Also, does Atlas support an include/exclude option for tables, topics, etc.?





[jira] [Updated] (ATLAS-4808) Automatic data classification Support by Atlas

2023-11-06 Thread Jagadesh Kiran N (Jira)


 [ 
https://issues.apache.org/jira/browse/ATLAS-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jagadesh Kiran N updated ATLAS-4808:

Description: 
a. We maintain a data lake. We store the data in folders in HDFS, and from 
there Hive takes the data and stores it in tables.

One of the Hive tables has a bunch of columns, and one of those columns 
contains PII data. Can the Atlas product automatically scan the data inside and 
detect any PII (such as name, email, or phone number)?

If it finds PII, it needs to mark the data classification for that object as 
PII (classifying the columns in the attribute) and then propagate the 
classification to any child objects created from that object, scanning the 
data automatically and asynchronously.

The scan process would classify a column as PII or other relevant information.

b. Also, does Atlas support an include/exclude option for tables, topics, etc.?

  was:
a. We maintain a data lake. We store the data in folders in HDFS, and from 
there Hive takes the data and stores it in tables.

One of the Hive tables has a bunch of columns; one of them is child_id, which 
contains PII data. Can the Atlas product automatically scan the data inside and 
detect any PII (such as name, email, or phone number)?

If it finds PII, it needs to mark the data classification for that object as 
PII (classifying the columns in the attribute) and then propagate the 
classification to any child objects created from that object, scanning the 
data automatically and asynchronously.

The scan process would classify a column as PII or other relevant information.

b. Also, does Atlas support an include/exclude option for tables, topics, etc.?


> Automatic data classification Support by Atlas
> --
>
> Key: ATLAS-4808
> URL: https://issues.apache.org/jira/browse/ATLAS-4808
> Project: Atlas
> Issue Type: Wish
> Reporter: Jagadesh Kiran N
> Priority: Minor
>
> a. We maintain a data lake. We store the data in folders in HDFS, and from 
> there Hive takes the data and stores it in tables.
> One of the Hive tables has a bunch of columns, and one of those columns 
> contains PII data. Can the Atlas product automatically scan the data inside 
> and detect any PII (such as name, email, or phone number)?
> If it finds PII, it needs to mark the data classification for that object as 
> PII (classifying the columns in the attribute) and then propagate the 
> classification to any child objects created from that object, scanning the 
> data automatically and asynchronously.
> The scan process would classify a column as PII or other relevant information.
> b. Also, does Atlas support an include/exclude option for tables, topics, etc.?


